CRYSTALLIZATION OF HISTONE DEACETYLASE 2
FIELD OF THE INVENTION 

[0001] The present invention relates to histone deacetylases ("HDAC") and more 
specifically to a particular HDAC known as HDAC-2. Provided is HDAC-2 in crystalline 
form, methods of forming crystals comprising HDAC-2, methods of using crystals comprising 
HDAC-2, a crystal structure of HDAC-2, and methods of using the crystal structure. 

BACKGROUND OF THE INVENTION 

[0002] A general approach to designing inhibitors that are selective for a given protein is to 
determine how a putative inhibitor interacts with a three dimensional structure of that protein. 
For this reason it is useful to obtain the protein in crystalline form and perform X-ray 
diffraction techniques to determine the protein's three dimensional structure coordinates. 
Various methods for preparing crystalline proteins are known in the art. 

[0003] Once protein crystals are produced, crystallographic data can be generated using the 
crystals to provide useful structural information that assists in the design of small molecules 
that bind to the active site of the protein and inhibit the protein's activity in vivo. If the protein 
is crystallized as a complex with a ligand, one can determine both the shape of the protein's 
binding pocket when bound to the ligand, as well as the amino acid residues that are capable of 
close contact with the ligand. By knowing the shape and amino acid residues comprised in the 
binding pocket, one may design new ligands that will interact favorably with the protein. With 
such structural information, available computational methods may be used to predict how 
strong the ligand binding interaction will be. Such methods aid in the design of inhibitors that 
bind strongly, as well as selectively to the protein. 

SUMMARY OF THE INVENTION 

[0004] The present invention is directed to crystals comprising HDAC-2 and particularly 
crystals comprising HDAC-2 that have sufficient size and quality to obtain useful information 
about the structural properties of HDAC-2 and molecules or complexes that may associate with 
HDAC-2. 
[0005] In one embodiment, a composition is provided that comprises a protein in crystalline
form wherein at least a portion of the protein has 55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% 
or greater identity with SEQ. ID No. 4. 

[0006] In one embodiment, a composition is provided that comprises a protein in crystalline 
form wherein at least a portion of the protein has 55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% 
or greater identity with SEQ. ID No. 5. 
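
The identity thresholds recited above can be checked along the following lines. This is a minimal sketch, assuming the two sequences have already been aligned (gaps written as "-") and that identity is counted over aligned, non-gap positions. This section does not fix an alignment program or a denominator, so both choices, and the names `percent_identity` and `thresholds_met`, are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns (assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Thresholds recited in the text, in percent.
THRESHOLDS = (55, 65, 75, 85, 90, 95, 97, 99)


def thresholds_met(identity: float) -> list:
    """Return every recited threshold that the identity value reaches."""
    return [t for t in THRESHOLDS if identity >= t]
```

For example, two aligned ten-residue fragments that differ at one position give 90% identity, which reaches the 55% through 90% thresholds but not the 95% threshold.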

[0007] In one variation, the protein has activity characteristic of HDAC-2. For example, the 
protein may optionally be inhibited by inhibitors of wild type HDAC-2. The protein crystal 
may also diffract X-rays for a determination of structure coordinates to a resolution of 4Å,
3.5Å, 3.0Å or less.

[0008] In one variation, the protein crystal has a crystal lattice in a P2₁ space group. The
protein crystal may also have a crystal lattice having unit cell dimensions, +/- 5%, of a=79.9Å,
b=56.9Å, c=95.2Å, α=90°, β=90.5°, and γ=90°.

[0009] In one variation, the protein crystal has a crystal lattice in a P2₁2₁2₁ space group.
The protein crystal may also have a crystal lattice having unit cell dimensions, +/- 5%, of
a=92.1Å, b=97.6Å, c=138.9Å, and α=β=γ=90°.
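
As a rough illustration of the +/- 5% unit cell criterion, the sketch below tests whether an observed cell matches either recited cell. It assumes the tolerance applies independently to each edge length and each angle, which the text does not spell out; the class `UnitCell` and the function `within_tolerance` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCell:
    a: float      # cell edge lengths in angstroms
    b: float
    c: float
    alpha: float  # cell angles in degrees
    beta: float
    gamma: float


# Reference cells recited in the text.
P21_CELL = UnitCell(79.9, 56.9, 95.2, 90.0, 90.5, 90.0)
P212121_CELL = UnitCell(92.1, 97.6, 138.9, 90.0, 90.0, 90.0)


def within_tolerance(observed: UnitCell, reference: UnitCell,
                     tolerance: float = 0.05) -> bool:
    """True if every parameter lies within +/- tolerance of the reference.

    Applying the tolerance per parameter (angles included) is an assumption.
    """
    pairs = zip(
        (observed.a, observed.b, observed.c,
         observed.alpha, observed.beta, observed.gamma),
        (reference.a, reference.b, reference.c,
         reference.alpha, reference.beta, reference.gamma),
    )
    return all(abs(obs - ref) <= tolerance * ref for obs, ref in pairs)
```

A cell of a=80.5, b=57.0, c=96.0 with angles 90, 90.4 and 90 degrees would pass against the monoclinic reference and fail against the orthorhombic one.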

[0010] The present invention is also directed to crystallizing HDAC-2. The present 
invention is also directed to the conditions useful for crystallizing HDAC-2. It should be 
recognized that a wide variety of crystallization methods can be used in combination with the 
crystallization conditions to form crystals comprising HDAC-2 including, but not limited to, 
vapor diffusion, batch, dialysis, and other methods of contacting the protein solution for the 
purpose of crystallization. 

[0011] In one embodiment, a method is provided for forming crystals of a protein 
comprising: forming a crystallization volume comprising: a protein wherein at least a portion 
of the protein has 55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. 
ID No. 4; and storing the crystallization volume under conditions suitable for crystal formation. 
[0012] In one variation, the crystallization volume comprises the protein in a concentration 
between 1 mg/ml and 50 mg/ml, and 5-50% w/v of precipitant wherein the precipitant 
comprises one or more members of the group consisting of PEG having a molecular weight 
range between 200-20000, 2-methyl-2,4-pentanediol (MPD) and isopropanol, and wherein the 
crystallization volume has a pH between pH 4 and pH 10. 
[0013] In one embodiment, a method is provided for forming crystals of a protein
comprising: forming a crystallization volume comprising: a protein wherein at least a portion 
of the protein has 55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. 
ID No. 5; and storing the crystallization volume under conditions suitable for crystal formation. 
[0014] In one variation, the crystallization volume comprises the protein in a concentration 
between 1 mg/ml and 50 mg/ml, and 5-50% w/v of precipitant wherein the precipitant 
comprises one or more members of the group consisting of PEG having a molecular weight 
range between 200-20000, 2-methyl-2,4-pentanediol (MPD) and isopropanol, and wherein the 
crystallization volume has a pH between pH 4 and pH 10. 
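
The recited ranges for a crystallization volume can be expressed as a simple screen filter. The sketch below is illustrative only: it assumes a single precipitant per condition (the text allows one or more), treats the range endpoints as inclusive, and uses hypothetical names (`Condition`, `in_recited_ranges`) that do not come from the source.

```python
from dataclasses import dataclass
from typing import Optional

# Non-PEG precipitants named in the text; PEG is qualified by molecular weight.
NON_PEG_PRECIPITANTS = {"MPD", "isopropanol"}


@dataclass(frozen=True)
class Condition:
    protein_mg_per_ml: float
    precipitant: str                 # "PEG", "MPD" or "isopropanol"
    precipitant_pct_wv: float
    ph: float
    peg_mw: Optional[float] = None   # only meaningful when precipitant == "PEG"


def in_recited_ranges(cond: Condition) -> bool:
    """True if the condition falls inside every range recited in the text."""
    if cond.precipitant == "PEG":
        precipitant_ok = cond.peg_mw is not None and 200 <= cond.peg_mw <= 20000
    else:
        precipitant_ok = cond.precipitant in NON_PEG_PRECIPITANTS
    return (
        precipitant_ok
        and 1 <= cond.protein_mg_per_ml <= 50
        and 5 <= cond.precipitant_pct_wv <= 50
        and 4 <= cond.ph <= 10
    )
```

A filter of this kind could be used to discard entries of a larger screen that fall outside the recited ranges before setting up trials.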

[0015] The method may optionally further comprise forming a protein crystal that has a 
crystal lattice in a P2₁ space group. The method also optionally further comprises forming a
protein crystal that has a crystal lattice having unit cell dimensions, +/- 5%, of a=79.9Å,
b=56.9Å, c=95.2Å, α=90°, β=90.5°, and γ=90°. The invention also relates to protein crystals
formed by these methods. 

[0016] The method may optionally further comprise forming a protein crystal that has a 
crystal lattice in a P2₁2₁2₁ space group. The method also optionally further comprises forming
a protein crystal that has a crystal lattice having unit cell dimensions, +/- 5%, of a=92.1Å,
b=97.6Å, c=138.9Å, and α=β=γ=90°. The invention also relates to protein crystals formed by
these methods. 

[0017] The present invention is also directed to a composition comprising an isolated 
protein that comprises or consists of one or more of the protein sequence(s) of HDAC-2 taught 
herein for crystallizing HDAC-2. The present invention is also directed to a composition 
comprising an isolated nucleic acid molecule that comprises or consists of the nucleotides for 
expressing the protein sequence of HDAC-2 taught herein for crystallizing HDAC-2. 
[0018] The present invention is also directed to an expression vector that may be used to 
express the isolated proteins taught herein for crystallizing HDAC-2. In one variation, the 
expression vector comprises a promoter that promotes expression of the isolated protein. 
[0019] The present invention is also directed to a cell line transformed or transfected by an 
isolated nucleic acid molecule or expression vector of the present invention. 
[0020] The present invention is also directed to structure coordinates for HDAC-2 as well as 
structure coordinates that are comparatively similar to these structure coordinates. It is noted 
that these comparatively similar structure coordinates may encompass proteins with similar
sequences and/or structures, such as other histone deacetylases. For example, machine- 
readable data storage media is provided having data storage material encoded with machine- 
readable data that comprises structure coordinates that are comparatively similar to the 
structure coordinates of HDAC-2. The present invention is also directed to a machine readable 
data storage medium having data storage material encoded with machine readable data, which, 
when read by an appropriate machine, can display a three dimensional representation of all or a 
portion of a structure of HDAC-2 or a model that is comparatively similar to the structure of all 
or a portion of HDAC-2. 

[0021] Various embodiments of machine readable data storage medium are provided that 
comprise data storage material encoded with machine readable data. The machine readable 
data comprises: structure coordinates that have a root mean square deviation equal to or less 
than the RMSD value specified in Columns 3, 4 or 5 of Table 1 when compared to the structure 
coordinates of Figure 3, the root mean square deviation being calculated such that the portion 
of amino acid residues specified in Column 2 of Table 1 of each set of structure coordinates are 
superimposed and the root mean square deviation is based only on those amino acid residues in 
the structure coordinates that are also present in the portion of the protein specified in Column 
1 of Table 1. The amino acids being overlaid and compared need not be identical when the
RMSD calculation is performed on alpha carbons and main chain atoms but the amino acids
being overlaid and compared must have identical side chains when the RMSD calculation is
performed on all non-hydrogen atoms.
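
The root mean square deviation described above can be sketched as follows. This is a simplified illustration, not the calculation method of the specification: it assumes the two coordinate sets have already been superimposed into a common frame (the superposition step itself is not shown), uses one atom per residue (for example the alpha carbon), and matches residues by a shared residue number. The function name `rmsd_common` is hypothetical.

```python
import math


def rmsd_common(target: dict, reference: dict) -> float:
    """RMSD over residues present in both coordinate sets.

    Each dict maps a residue number to an (x, y, z) tuple in angstroms.
    Both sets are assumed to be superimposed already.
    """
    common = sorted(set(target) & set(reference))
    if not common:
        raise ValueError("no residues in common")
    total = 0.0
    for res in common:
        # squared distance between the paired atoms of this residue
        total += sum((t - r) ** 2 for t, r in zip(target[res], reference[res]))
    return math.sqrt(total / len(common))
```

The returned value can then be compared with the applicable limit in Table 1; residues present in only one of the two sets do not contribute to the sum.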

[0022] For example, in one embodiment where the comparison is based on the 10 Angstrom 
set of amino acid residues (Column 1) and is based on superimposing alpha-carbon atoms 
(Column 2), the structure coordinates may have a root mean square deviation equal to or less 
than 0.45 when compared to the structure coordinates of Figure 3. 
TABLE 1

AA RESIDUES TO USE      PORTION OF EACH AA         RMSD VALUE LESS THAN
TO PERFORM RMSD         RESIDUE USED TO PERFORM    OR EQUAL TO
COMPARISON              RMSD COMPARISON
(Column 1)              (Column 2)                 (Column 3)   (Column 4)   (Column 5)

Table 2                 alpha-carbon atoms (1)     0.33         0.22         0.17
(4 Angstrom set)        main-chain atoms (1)       [illegible]  [illegible]  [illegible]
                        all non-hydrogen (2)       0.49         0.33         0.25

Table 3                 alpha-carbon atoms (1)     0.39         0.26         0.20
(7 Angstrom set)        main-chain atoms (1)       0.44         0.30         0.22
                        all non-hydrogen (2)       0.60         0.40         0.30

Table 4                 alpha-carbon atoms (1)     0.45         0.30         0.23
(10 Angstrom set)       main-chain atoms (1)       0.51         0.34         0.25
                        all non-hydrogen (2)       0.63         0.42         0.31

SEQ. ID No. 3           alpha-carbon atoms (1)     1.13         0.75         0.56
                        main-chain atoms (1)       1.11         0.74         0.56
                        all non-hydrogen (2)       1.24         0.82         0.62


1 - the RMSD computed between the atoms of all amino acids that are common to both the target and the reference 
in the aligned and superposed structure. The amino acids need not be identical. 

2 - the RMSD computed only between identical amino acids, which are common to both the target and the 
reference in the aligned and superposed structure. 
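
The distinction drawn in footnotes 1 and 2 amounts to a choice of which residue pairs enter the RMSD sum. The sketch below illustrates that selection under stated assumptions: residues of the target and the reference are matched by a shared residue number after alignment, and residue identity is judged by three-letter name. The function name `comparable_residues` is hypothetical.

```python
def comparable_residues(target: dict, reference: dict,
                        require_identical: bool) -> list:
    """Residue numbers eligible for an RMSD comparison.

    target and reference map an aligned residue number to a three-letter
    residue name. require_identical=False follows footnote 1 (any residue
    common to both structures); require_identical=True follows footnote 2
    (only residues that are identical in both structures).
    """
    common = sorted(set(target) & set(reference))
    if require_identical:
        return [n for n in common if target[n] == reference[n]]
    return common
```

With a target containing HIS 10, ASP 11 and LEU 12 and a reference containing HIS 10, GLU 11 and TYR 13, footnote 1 would select residues 10 and 11, while footnote 2 would select residue 10 only.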



[0023] The present invention is also directed to a three-dimensional structure of all or a 
portion of HDAC-2. This three-dimensional structure may be used to identify binding sites, to 
provide mutants having desirable binding properties, and ultimately, to design, characterize, or 
identify ligands capable of interacting with HDAC-2. Ligands that interact with HDAC-2 may 
be any type of atom, compound, protein or chemical group that binds to or otherwise associates 
with the protein. Examples of types of ligands include natural substrates for HDAC-2, 
inhibitors of HDAC-2, and heavy atoms. The inhibitors of HDAC-2 may optionally be used as 
drugs to treat therapeutic indications by modifying the in vivo activity of HDAC-2. 
[0024] In various embodiments, methods are provided for displaying a three dimensional 
representation of a structure of a protein comprising: 

taking machine readable data comprising structure coordinates that have a root mean 
square deviation equal to or less than the RMSD value specified in Columns 3, 4 or 5 of Table 
1 when compared to the structure coordinates of Figure 3, the root mean square deviation being 
calculated such that the portion of amino acid residues specified in Column 2 of Table 1 of 
each set of structure coordinates are superimposed and the root mean square deviation is based 
only on those amino acid residues in the structure coordinates that are also present in the 
portion of the protein specified in Column 1 of Table 1; 
computing a three dimensional representation of a structure based on the structure
coordinates; and 

displaying the three dimensional representation. 
[0025] The present invention is also directed to a method for solving a three-dimensional 
crystal structure of a target protein using the structure of HDAC-2. 

[0026] In various embodiments, computational methods are provided comprising: taking 
machine readable data comprising structure coordinates that have a root mean square deviation 
equal to or less than the RMSD value specified in Columns 3, 4 or 5 of Table 1 when compared 
to the structure coordinates of Figure 3, the root mean square deviation being calculated such 
that the portion of amino acid residues specified in Column 2 of Table 1 of each set of structure 
coordinates are superimposed and the root mean square deviation is based only on those amino 
acid residues in the structure coordinates that are also present in the portion of the protein 
specified in Column 1 of Table 1; 

computing phases based on the structural coordinates; 
computing an electron density map based on the computed phases; and 
determining a three-dimensional crystal structure based on the computed electron 
density map. 
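
To make the phase computation step concrete, the sketch below evaluates the standard structure factor sum F(hkl) = sum over atoms of f * exp(2*pi*i*(h*x + k*y + l*z)) for a set of atoms given in fractional coordinates and returns its phase. It is a simplified illustration and not the procedure of the specification: scattering factors are treated as constants, B-factors and occupancies are ignored, and the function names are hypothetical.

```python
import cmath
import math


def structure_factor(hkl, atoms) -> complex:
    """F(hkl) for atoms given as (f, x, y, z) with fractional x, y, z.

    f is a constant scattering factor (a simplification: real scattering
    factors fall off with resolution, and B-factors are ignored here).
    """
    h, k, ell = hkl
    total = 0j
    for f, x, y, z in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + ell * z))
    return total


def phase_degrees(f_hkl: complex) -> float:
    """Phase angle of a structure factor, in degrees."""
    return math.degrees(cmath.phase(f_hkl))
```

Phases computed in this way from model coordinates are what allow measured amplitudes to be combined into an electron density map by Fourier synthesis.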

[0027] In various embodiments, computational methods are provided comprising: taking an 
X-ray diffraction pattern of a crystal of the target protein; and computing a three-dimensional 
electron density map from the X-ray diffraction pattern by molecular replacement, wherein 
structure coordinates used as a molecular replacement model comprise structure coordinates 
that have a root mean square deviation equal to or less than the RMSD value specified in 
Columns 3, 4 or 5 of Table 1 when compared to the structure coordinates of Figure 3, the root 
mean square deviation being calculated such that the portion of amino acid residues specified in 
Column 2 of Table 1 of each set of structure coordinates are superimposed and the root mean 
square deviation is based only on those amino acid residues in the structure coordinates that are 
also present in the portion of the protein specified in Column 1 of Table 1. 
[0028] These methods may optionally further comprise determining a three-dimensional 
crystal structure based upon the computed three-dimensional electron density map. 
[0029] The present invention is also directed to using a crystal structure of HDAC-2, in 
particular the structure coordinates of HDAC-2 and the surface contour defined by them, in 
methods for screening, designing, or optimizing molecules or other chemical entities that
interact with and preferably inhibit HDAC-2. 

[0030] One skilled in the art will appreciate the numerous uses of the inventions described 
herein, particularly in the areas of drug design, screening and optimization of drug candidates, 
as well as in determining additional unknown crystal structures. For example, a further aspect 
of the present invention relates to using a three-dimensional crystal structure of all or a portion 
of HDAC-2 and/or its structure coordinates to evaluate the ability of entities to associate with 
HDAC-2. The entities may be any entity that may function as a ligand and thus may be any 
type of atom, compound, protein (such as antibodies) or chemical group that can bind to or 
otherwise associate with a protein. 

[0031] In various embodiments, methods are provided for evaluating a potential of an entity 
to associate with a protein comprising: 

creating a computer model of a protein structure using structure coordinates that 
comprise structure coordinates that have a root mean square deviation equal to or less than the 
RMSD value specified in Columns 3, 4 or 5 of Table 1 when compared to the structure 
coordinates of Figure 3, the root mean square deviation being calculated such that the portion 
of amino acid residues specified in Column 2 of Table 1 of each set of structure coordinates are 
superimposed and the root mean square deviation is based only on those amino acid residues in 
the structure coordinates that are also present in the portion of the protein specified in Column 
1 of Table 1; 

performing a fitting operation between the entity and the computer model; and 
analyzing results of the fitting operation to quantify an association between the entity 
and the model. 
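
One way to picture "quantifying an association" after a fitting operation is a pairwise interaction score between the atoms of the entity and the atoms of the modeled protein. The sketch below uses a generic 12-6 Lennard-Jones sum purely as a stand-in; the specification does not name this scoring function, and the parameters `epsilon` and `sigma` are placeholder values applied uniformly to every atom pair.

```python
import math


def lennard_jones_score(entity_atoms, pocket_atoms,
                        epsilon: float = 0.2, sigma: float = 3.5) -> float:
    """Sum of 12-6 Lennard-Jones terms over all entity/pocket atom pairs.

    Atoms are (x, y, z) tuples in angstroms. epsilon (kcal/mol) and sigma
    (angstroms) are hypothetical placeholders, not values from the source.
    More negative scores indicate a more favorable fit in this toy model.
    """
    score = 0.0
    for entity_atom in entity_atoms:
        for pocket_atom in pocket_atoms:
            r = math.dist(entity_atom, pocket_atom)
            if r == 0.0:
                raise ValueError("overlapping atoms: distance is zero")
            ratio6 = (sigma / r) ** 6
            score += 4.0 * epsilon * (ratio6 ** 2 - ratio6)
    return score
```

Practical fitting programs use richer energy functions with per-atom-type parameters, electrostatics and solvation terms; the point here is only that the result of the fitting operation reduces to a number that can be compared across entities.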

[0032] In other embodiments, methods are provided for identifying entities that can 
associate with a protein comprising: 

generating a three-dimensional structure of a protein using structure coordinates that 
comprise structure coordinates that have a root mean square deviation equal to or less than the 
RMSD value specified in Columns 3, 4 or 5 of Table 1 when compared to the structure 
coordinates of Figure 3, the root mean square deviation being calculated such that the portion 
of amino acid residues specified in Column 2 of Table 1 of each set of structure coordinates are 
superimposed and the root mean square deviation is based only on those amino acid residues in 
the structure coordinates that are also present in the portion of the protein specified in Column
1 of Table 1; 

employing the three-dimensional structure to design or select an entity that can 
associate with the protein; and 

contacting the entity with a protein wherein at least a portion of the protein has 55%, 
65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. ID No. 4. 
[0033] In other embodiments, methods are provided for identifying entities that can 
associate with a protein comprising: 

generating a three-dimensional structure of a protein using structure coordinates that 
comprise structure coordinates that have a root mean square deviation equal to or less than the 
RMSD value specified in Columns 3, 4 or 5 of Table 1 when compared to the structure 
coordinates of Figure 3, the root mean square deviation being calculated such that the portion 
of amino acid residues specified in Column 2 of Table 1 of each set of structure coordinates are 
superimposed and the root mean square deviation is based only on those amino acid residues in 
the structure coordinates that are also present in the portion of the protein specified in Column 
1 of Table 1; 

employing the three-dimensional structure to design or select an entity that can 
associate with the protein; and 

contacting the entity with a protein wherein at least a portion of the protein has 55%, 
65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. ID No. 5. 
[0034] In other embodiments, methods are provided for identifying entities that can 
associate with a protein comprising: 

generating a three-dimensional structure of a protein using structure coordinates that 
comprise structure coordinates that have a root mean square deviation equal to or less than the 
RMSD value specified in Columns 3, 4 or 5 of Table 1 when compared to the structure 
coordinates of Figure 3, the root mean square deviation being calculated such that the portion 
of amino acid residues specified in Column 2 of Table 1 of each set of structure coordinates are 
superimposed and the root mean square deviation is based only on those amino acid residues in 
the structure coordinates that are also present in the portion of the protein specified in Column 
1 of Table 1; and

employing the three-dimensional structure to design or select an entity that can
associate with the protein. 

[0035] In other embodiments, methods are provided for identifying entities that can 
associate with a protein comprising: 

computing a computer model for a protein binding pocket, at least a portion of the 
computer model having a surface contour that has a root mean square deviation equal to or less 
than a given RMSD value specified in Columns 3, 4 or 5 of Table 1 when the coordinates used 
to compute the surface contour are compared to the structure coordinates of Figure 3, wherein 
(a) the root mean square deviation is calculated by the calculation method set forth herein, (b) 
the portion of amino acid residues associated with the given RMSD value in Table 1 (specified 
in Column 2 of Table 1) are superimposed according to the RMSD calculation, and (c) the root 
mean square deviation is calculated based only on those amino acid residues present in both the 
protein being modeled and the portion of the protein associated with the given RMSD in Table 
1 (specified in Column 1 of Table 1); 

employing the computer model to design or select an entity that can associate with the 
protein; and 

contacting the entity with a protein wherein at least a portion of the protein has 55%, 
65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. ID No. 4. 
[0036] In other embodiments, methods are provided for identifying entities that can 
associate with a protein comprising: 

computing a computer model for a protein binding pocket, at least a portion of the 
computer model having a surface contour that has a root mean square deviation equal to or less 
than a given RMSD value specified in Columns 3, 4 or 5 of Table 1 when the coordinates used 
to compute the surface contour are compared to the structure coordinates of Figure 3, wherein 
(a) the root mean square deviation is calculated by the calculation method set forth herein, (b) 
the portion of amino acid residues associated with the given RMSD value in Table 1 (specified 
in Column 2 of Table 1) are superimposed according to the RMSD calculation, and (c) the root 
mean square deviation is calculated based only on those amino acid residues present in both the 
protein being modeled and the portion of the protein associated with the given RMSD in Table 
1 (specified in Column 1 of Table 1); 
employing the computer model to design or select an entity that can associate with the
protein; and 

contacting the entity with a protein wherein at least a portion of the protein has 55%, 
65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. ID No. 5. 
[0037] In other embodiments, methods are provided for identifying entities that can 
associate with a protein comprising: 

computing a computer model for a protein binding pocket, at least a portion of the 
computer model having a surface contour that has a root mean square deviation equal to or less 
than a given RMSD value specified in Columns 3, 4 or 5 of Table 1 when the coordinates used 
to compute the surface contour are compared to the structure coordinates of Figure 3, wherein 
(a) the root mean square deviation is calculated by the calculation method set forth herein, (b) 
the portion of amino acid residues associated with the given RMSD value in Table 1 (specified 
in Column 2 of Table 1) are superimposed according to the RMSD calculation, and (c) the root 
mean square deviation is calculated based only on those amino acid residues present in both the 
protein being modeled and the portion of the protein associated with the given RMSD in Table 
1 (specified in Column 1 of Table 1); and 

employing the computer model to design or select an entity that can associate with the 
protein. 

[0038] In other embodiments, methods are provided for evaluating the ability of an entity to 
associate with a protein, the method comprising: 

constructing a computer model defined by structure coordinates that have a root mean 
square deviation equal to or less than the RMSD value specified in Columns 3, 4 or 5 of Table 
1 when compared to the structure coordinates of Figure 3, the root mean square deviation being 
calculated such that the portion of amino acid residues specified in Column 2 of Table 1 of 
each set of structure coordinates are superimposed and the root mean square deviation is based 
only on those amino acid residues in the structure coordinates that are also present in the 
portion of the protein specified in Column 1 of Table 1; 

selecting an entity to be evaluated by a method selected from the group consisting of (i) 
assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule 
database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for 
HDAC-2, or a portion thereof; 
performing a fitting program operation between computer models of the entity to be
evaluated and the binding pocket in order to provide an energy-minimized configuration of the 
entity in the binding pocket; and 

evaluating the results of the fitting operation to quantify the association between the 
entity and the binding pocket model in order to evaluate the ability of the entity to associate 
with the binding pocket. 

[0039] In other embodiments, methods are provided for evaluating the ability of an entity to 
associate with a protein, the method comprising: 

computing a computer model for a protein binding pocket, at least a portion of the 
computer model having a surface contour that has a root mean square deviation equal to or less 
than a given RMSD value specified in Columns 3, 4 or 5 of Table 1 when the coordinates used 
to compute the surface contour are compared to the structure coordinates of Figure 3, wherein 
(a) the root mean square deviation is calculated by the calculation method set forth herein, (b) 
the portion of amino acid residues associated with the given RMSD value in Table 1 (specified 
in Column 2 of Table 1) are superimposed according to the RMSD calculation, and (c) the root 
mean square deviation is calculated based only on those amino acid residues present in both the 
protein being modeled and the portion of the protein associated with the given RMSD in Table 
1 (specified in Column 1 of Table 1); 

selecting an entity to be evaluated by a method selected from the group consisting of (i) 
assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule 
database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for 
HDAC-2, or a portion thereof; 

performing a fitting program operation between computer models of the entity to be 
evaluated and the binding pocket in order to provide an energy-minimized configuration of the 
entity in the binding pocket; and 

evaluating the results of the fitting operation to quantify the association between the 
entity and the binding pocket model in order to evaluate the ability of the entity to associate 
with the binding pocket. 

[0040] In regard to each of these embodiments, the protein may optionally have activity 
characteristic of HDAC-2. For example, the protein may optionally be inhibited by inhibitors 
of wild type HDAC-2. 
[0041] In another embodiment, a method is provided for identifying an entity that associates
with a protein comprising: taking structure coordinates from diffraction data obtained from a 
crystal of a protein wherein at least a portion of the protein has 55%, 65%, 75%, 85%, 90%, 
95%, 97%, 99% or greater identity with SEQ. ID No. 4; and performing rational drug design
using a three dimensional structure that is based on the obtained structure coordinates. 
[0042] In another embodiment, a method is provided for identifying an entity that associates 
with a protein comprising: taking structure coordinates from diffraction data obtained from a 
crystal of a protein wherein at least a portion of the protein has 55%, 65%, 75%, 85%, 90%, 
95%, 97%, 99% or greater identity with SEQ. ID No. 5; and performing rational drug design 
using a three dimensional structure that is based on the obtained structure coordinates. 
[0043] The protein crystals may optionally have a crystal lattice with a P2₁ space group and
unit cell dimensions, +/- 5%, of a=79.9Å, b=56.9Å, c=95.2Å, α=90°, β=90.5°, and γ=90°.
[0044] The protein crystals may optionally have a crystal lattice with a P2₁2₁2₁ space group
and unit cell dimensions, +/- 5%, of a=92.1Å, b=97.6Å, c=138.9Å, and α=β=γ=90°.
[0045] The method may optionally further comprise selecting one or more entities based on 
the rational drug design and contacting the selected entities with the protein. The method may 
also optionally further comprise measuring an activity of the protein when contacted with the 
one or more entities. The method also may optionally further comprise comparing activity of 
the protein in a presence of and in the absence of the one or more entities; and selecting entities 
where activity of the protein changes depending whether a particular entity is present. The 
method also may optionally further comprise contacting cells expressing the protein with the 
one or more entities and detecting a change in a phenotype of the cells when a particular entity 
is present. 
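
The comparison of protein activity in the presence and in the absence of an entity can be sketched as a simple selection step. The example below is illustrative only: the 50% cutoff, the data layout and the function names are assumptions, and no measured values from the specification are used.

```python
def percent_change(activity_with: float, activity_without: float) -> float:
    """Signed change in activity relative to the no-entity control, in percent."""
    if activity_without == 0:
        raise ValueError("control activity must be non-zero")
    return 100.0 * (activity_with - activity_without) / activity_without


def select_entities(measurements: dict, min_abs_change: float = 50.0) -> list:
    """Names of entities whose presence shifts activity by at least the cutoff.

    measurements maps an entity name to a tuple of
    (activity with entity, activity without entity).
    The default cutoff is a hypothetical value chosen for illustration.
    """
    return [
        name
        for name, (with_entity, without_entity) in measurements.items()
        if abs(percent_change(with_entity, without_entity)) >= min_abs_change
    ]
```

An entity that lowers activity from 100 to 20 arbitrary units gives a change of -80% and would be selected, whereas one that lowers it only to 95 units would not.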

BRIEF DESCRIPTION OF THE FIGURES 

[0046] Figure 1 illustrates SEQ. ID Nos. 1, 2, 3, 4, and 5 referred to in this application. 
[0047] Figure 2A illustrates a crystal formed according to Example 1 with a crystal lattice in 
a P2₁ space group.

[0048] Figure 2B illustrates a crystal formed according to Example 2 with a crystal lattice in 
a P2₁2₁2₁ space group.
[0049] Figure 3 lists a set of atomic structure coordinates for HDAC-2 as derived by X-ray
crystallography from a crystal that comprises the protein. The following abbreviations are used 
in Figure 3: "X, Y, Z" crystallographically define the atomic position of the element measured; 
"B" is a thermal factor that measures movement of the atom around its atomic center; "Occ" is 
an occupancy factor that refers to the fraction of the molecules in which each atom occupies the 
position specified by the coordinates (a value of "1" indicates that each atom has the same 
conformation, i.e., the same position, in all molecules of the crystal). 
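
The fields described above (X, Y, Z, Occ and B) can be read from a coordinate listing along the following lines. This sketch assumes the coordinates are laid out as fixed-column PDB-style ATOM records, which this section does not confirm for Figure 3; the example line is made up and is not taken from the figure, and the names `AtomRecord` and `parse_atom_line` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomRecord:
    atom_name: str
    residue_name: str
    residue_number: int
    x: float          # orthogonal coordinates in angstroms
    y: float
    z: float
    occupancy: float  # "Occ"
    b_factor: float   # "B", the thermal factor


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one fixed-column PDB-style ATOM line (assumed layout)."""
    return AtomRecord(
        atom_name=line[12:16].strip(),
        residue_name=line[17:20].strip(),
        residue_number=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


# Made-up example line (not from Figure 3), built column by column.
EXAMPLE_LINE = (
    "ATOM  " + f"{1:5d}" + " " + " CA " + " " + "HIS" + " " + "A"
    + f"{145:4d}" + " " + "   "
    + f"{12.345:8.3f}{-6.789:8.3f}{0.5:8.3f}{1.0:6.2f}{25.5:6.2f}"
)
```

Parsing `EXAMPLE_LINE` yields an alpha-carbon record for residue 145 with an occupancy of 1.0, the value that the text associates with an atom occupying the same position in all molecules of the crystal.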

[0050] Figure 4A illustrates a ribbon diagram overview of the structure of HDAC-2, 
highlighting the secondary structural elements of the protein. 

[0051] Figure 4B illustrates a ribbon diagram overview of the structure of HDAC-2 rotated 
90° with respect to the view in Figure 4A, highlighting the secondary structural elements of the 
protein. 

[0052] Figure 5 illustrates key interactions between groups in the binding pocket and the 
TSA molecule. 

[0053] Figure 6 illustrates a system that may be used to carry out instructions for displaying 
a crystal structure of HDAC-2 encoded on a storage medium. 

DETAILED DESCRIPTION OF THE INVENTION 

[0054] The present invention relates to a member of the histone deacetylase (HDAC) family 
known as HDAC-2. More specifically, the present invention relates to HDAC-2 in crystalline
form, methods of forming crystals comprising HDAC-2, methods of using crystals comprising 
HDAC-2, a crystal structure of HDAC-2, and methods of using the crystal structure. 
[0055] In describing protein structure and function herein, reference is made to amino acids 
comprising the protein. The amino acids may also be referred to by their conventional 
abbreviations; A = Ala = Alanine; T = Thr = Threonine; V = Val = Valine; C = Cys = Cysteine; 
L = Leu = Leucine; Y = Tyr = Tyrosine; I = Ile = Isoleucine; N = Asn = Asparagine; P = Pro =
Proline; Q = Gln = Glutamine; F = Phe = Phenylalanine; D = Asp = Aspartic Acid; W = Trp =
Tryptophan; E = Glu = Glutamic Acid; M = Met = Methionine; K = Lys = Lysine; G = Gly = 
Glycine; R = Arg = Arginine; S = Ser = Serine; and H = His = Histidine. 

1. HDAC-2

[0056] Histone deacetylases (HDACs) play important roles in the modulation of chromatin 
structure and the regulation of gene expression. HDACs are involved in cell-cycle progression 
and differentiation, and their deregulation is associated with several different forms of cancer. 
HDACs are known to deacetylate the ε-amino group of lysine residues.

[0057] Seventeen human genes that encode proven or putative histone deacetylases 
(HDACs) have been identified to date, some of which are described in Johnstone, R. W., 
"Histone-Deacetylase Inhibitors: Novel Drugs for the Treatment of Cancer", Nature Reviews, 
Volume 1, pp. 287-299, (2002) and PCT Publication Nos. 00/10583, 01/18045, 01/42437 and
02/08273. 

[0058] HDACs have been categorized into three distinct classes based on size and sequence 
homology. Class I of the HDAC family includes HDACs 1, 2, 3 and 8.

[0059] HDAC-2 is a 488 residue, 55 kDa protein localized to the nucleus of a wide array of 
tissues, as well as several human tumor cell lines. The wild-type form of full length HDAC-2 
is set forth as SEQ. ID No. 1 (GenBank Accession Number NM_001527; Furukawa, Y. et al.
"Isolation and mapping of a human gene (RPD3L1) that is homologous to RPD3, a
transcription factor in Saccharomyces cerevisiae" Cytogenet. Cell Genet. 73 (1-2), 130-133
(1996)). Zn is likely native to the protein and required for HDAC-2 activity. All class I
HDACs (including HDAC-2) appear to be sensitive to inhibition by trichostatin A (TSA), 
which is the ligand present in the structure described herein. 

[0060] It should be understood that the methods and compositions provided relating to 
HDAC-2 are not intended to be limited to the wild type, full length form of HDAC-2. Instead, 
the present invention also relates to fragments and variants of HDAC-2 as described herein. 
[0061] In one embodiment, HDAC-2 comprises the wild-type form of full length HDAC-2, 
set forth herein as SEQ. ID No. 1. 

[0062] It should be recognized that the invention may be readily extended to various 
variants of wild-type HDAC-2 and variants of fragments thereof. In another embodiment, 
HDAC-2 comprises a sequence that has at least 65% identity, preferably at least 70%, 80%, 
90%, 95% or higher identity with SEQ. ID No. 1. In another embodiment, HDAC-2 comprises 
a sequence that has at least 65% identity, preferably at least 70%, 80%, 90%, 95% or higher 
identity with SEQ. ID No. 4. In yet another embodiment, HDAC-2 comprises a sequence that 
has at least 65% identity, preferably at least 70%, 80%, 90%, 95% or higher identity with SEQ.
ID No. 5. 
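
The identity thresholds above can be checked with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned (with "-" marking gaps) and that identity is counted over columns in which neither sequence has a gap; other conventions for the denominator exist, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical pre-aligned fragments, invented for illustration.
reference = "ACDEFGHIKL"
variant = "ACDEYGH-KL"
identity = percent_identity(reference, variant)
meets_65_percent = identity >= 65.0
```

In practice the alignment itself would come from a dedicated alignment program; only the counting step is sketched here.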

[0063] It is also noted that the above sequences of HDAC-2 are also intended to encompass
isoforms, mutants and fusion proteins of these sequences. An example of a fusion protein is 
provided by SEQ. ID No. 3 which includes a 7 residue C-terminal tag (GHHHHHH) that may 
be used to facilitate purification of the protein. 

[0064] With the crystal structure provided herein, the positions of the amino acid residues
in the structure are now known. As a result, the impact of different substitutions can be more
easily predicted and understood.

[0065] For example, based on the crystal structure, applicants have determined that HDAC-2
has one binding pocket capable of binding to a TSA molecule. Figure 5 illustrates a TSA
molecule bound to the HDAC-2 binding pocket.

[0066] The amino acids shown in Table 2 were found to be within 4 Angstroms of the 
binding pocket and therefore close enough to interact with TSA. Applicants have also 
determined that the amino acids of Table 3 are within 7 Angstroms of TSA bound in the 
binding pocket and therefore are also close enough to interact with that substrate or analogs 
thereof. Further it has been determined that the amino acids of Table 4 are within 10 
Angstroms of the bound TSA in the binding pocket. 
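
A distance-based selection of this kind can be sketched as follows. This is a minimal illustration, assuming protein atoms are available as (residue name, residue number, x, y, z) tuples and ligand atoms as (x, y, z) tuples in Angstroms; the function name and the toy coordinates are hypothetical and are not taken from the crystal structure.

```python
import math
from typing import List, Sequence, Tuple

ProteinAtom = Tuple[str, int, float, float, float]  # (res_name, res_seq, x, y, z)
Point = Tuple[float, float, float]


def residues_within(
    protein_atoms: Sequence[ProteinAtom],
    ligand_atoms: Sequence[Point],
    cutoff: float,
) -> List[Tuple[int, str]]:
    """Return (res_seq, res_name) for residues with any atom within cutoff of any ligand atom."""
    hits = set()
    for res_name, res_seq, x, y, z in protein_atoms:
        for ligand_point in ligand_atoms:
            if math.dist((x, y, z), ligand_point) <= cutoff:
                hits.add((res_seq, res_name))
                break  # one close contact is enough to include the residue
    return sorted(hits)


# Toy coordinates invented for illustration (one atom per residue, one ligand atom).
toy_protein = [
    ("HIS", 145, 1.0, 0.0, 0.0),
    ("TYR", 308, 5.0, 0.0, 0.0),
    ("TRP", 317, 9.0, 0.0, 0.0),
]
toy_ligand = [(0.0, 0.0, 0.0)]

within_4 = residues_within(toy_protein, toy_ligand, 4.0)
within_7 = residues_within(toy_protein, toy_ligand, 7.0)
within_10 = residues_within(toy_protein, toy_ligand, 10.0)
```

Because each larger cutoff includes every residue selected by a smaller one, the three resulting sets are nested, as is expected of Tables 2, 3 and 4.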

[0067] One or more of these sets of amino acids is preferably conserved in a variant of
HDAC-2. Hence, HDAC-2 may optionally comprise a sequence wherein at least a portion of
the protein has at least 65% identity, preferably at least 70%, 80%, 90%, 95% or higher
identity with any one of the above sequences (e.g., all of SEQ. ID No. 1, SEQ. ID No. 4, or
SEQ. ID No. 5), where at least the residues shown in Tables 2, 3, and 4 are conserved with the
exception of 0, 1, 2, 3, or 4 residues. It should be recognized that one might optionally vary
some of the binding site residues in order to determine the effect such changes have on
structure or activity.

Table 2: Amino Acids encompassed by a 4-Angstrom radius around the HDAC-2 active site.

GLY 32, HIS 33, PRO 34, ASP 104, HIS 145, HIS 146, GLY 154, PHE 155, ASP 181, HIS 183,
PHE 210, ASP 269, LEU 276, TYR 308


Table 3: Amino Acids encompassed by a 7-Angstrom radius around the HDAC-2 active site.

GLN 31, GLY 32, HIS 33, PRO 34, MET 35, GLU 103, ASP 104, CYS 105, PRO 106, GLY 143,
HIS 145, HIS 146, SER 153, GLY 154, PHE 155, CYS 156, ASP 179, ASP 181, ILE 182, HIS 183,
HIS 184, TYR 209, PHE 210, GLN 265, GLY 267, ASP 269, ASP 274, ARG 275, LEU 276, GLY 305,
GLY 306, GLY 307, TYR 308


Table 4: Amino Acids encompassed by a 10-Angstrom radius around the HDAC-2 active site.

TYR 29, GLY 30, GLN 31, GLY 32, HIS 33, PRO 34, MET 35, LYS 36, ARG 39, ASN 100,
GLY 102, GLU 103, ASP 104, CYS 105, PRO 106, GLY 142, GLY 143, LEU 144, HIS 145, HIS 146,
ALA 147, LYS 148, ALA 153, SER 153, GLY 154, PHE 155, CYS 156, TYR 157, ILE 161, TYR 177,
ASP 179, ILE 180, ASP 181, ILE 182, HIS 183, HIS 184, GLY 185, ASP 186, GLY 187, SER 202,
PHE 203, HIS 204, GLU 208, TYR 209, PHE 210, PRO 211, GLY 212, GLN 265, CYS 266, GLY 267,
ALA 268, ASP 269, SER 270, ASP 274, ARG 275, LEU 276, GLY 277, CYS 278, GLY 304, GLY 305,
GLY 306, GLY 307, TYR 308, THR 309, VAL 313, TRP 317

[0068] With the benefit of the crystal structure and guidance provided by Tables 2, 3, and 4,
a wide variety of HDAC-2 variants (e.g., insertions, deletions, substitutions, etc.) that fall 
within the above specified identity ranges may be designed and manufactured utilizing 
recombinant DNA techniques well known to those skilled in the art, particularly in view of the 
knowledge of the crystal structure provided herein. These modifications can be used in a 
number of combinations to produce the variants. The present invention is useful for 
crystallizing and then solving the structure of the range of variants of HDAC-2. 
[0069] Variants of HDAC-2 may be insertional variants in which one or more amino acid 
residues are introduced into a predetermined site in the HDAC-2 sequence. For instance, 
insertional variants can be fusions of heterologous proteins or polypeptides to the amino or 
carboxyl terminus of the subunits. 

[0070] Variants of HDAC-2 also may be substitutional variants in which at least one residue 
has been removed and a different residue inserted in its place. Non-natural amino acids (i.e.,
amino acids not normally found in native proteins), as well as isosteric analogs (amino acid or
otherwise) may optionally be employed in substitutional variants. Examples of suitable
substitutions are well known in the art, such as Glu→Asp, Ser→Cys, Cys→Ser, and His→Ala.

[0071] Another class of variants is deletional variants, which are characterized by the 
removal of one or more amino acid residues from the HDAC-2 sequence. 
[0072] Other variants may be produced by chemically modifying amino acids of the native 
protein (e.g., diethylpyrocarbonate treatment that modifies histidine residues). Preferred are
chemical modifications that are specific for certain amino acid side chains. Specificity may 
also be achieved by blocking other side chains with antibodies directed to the side chains to be 
protected. Chemical modification includes such reactions as oxidation, reduction, amidation, 
deamidation, or substitution of bulky groups such as polysaccharides or polyethylene glycol. 
[0073] Exemplary modifications include the modification of lysinyl and amino terminal 
residues by reaction with succinic or other carboxylic acid anhydrides. Modification with these 
agents has the effect of reversing the charge of the lysinyl residues. Other suitable reagents for 
modifying amino-containing residues include imidoesters such as methyl picolinimidate; 
pyridoxal phosphate; pyridoxal chloroborohydride; trinitrobenzenesulfonic acid;
O-methylisourea; 2,4-pentanedione; and transaminase catalyzed reaction with glyoxylate; and
N-hydroxysuccinimide esters of polyethylene glycol or other bulky substitutions.
[0074] Arginyl residues may be modified by reaction with a number of reagents, including 
phenylglyoxal; 2,3-butanedione; 1,2-cyclohexanedione; and ninhydrin. Modification of 
arginine residues requires that the reaction be performed in alkaline conditions because of the 
high pKa of the guanidine functional group. Furthermore, these reagents may react with the
groups of lysine as well as the arginine epsilon-amino group. 

[0075] Tyrosyl residues may also be modified to introduce spectral labels into tyrosyl 
residues by reaction with aromatic diazonium compounds or tetranitromethane, forming
O-acetyl tyrosyl species and 3-nitro derivatives, respectively. Tyrosyl residues may also be
iodinated using ¹²⁵I or ¹³¹I to prepare labeled proteins for use in radioimmunoassays.
[0076] Carboxyl side groups (aspartyl or glutamyl) may be selectively modified by reaction 
with carbodiimides or they may be converted to asparaginyl and glutaminyl residues by 
reaction with ammonium ions. Conversely, asparaginyl and glutaminyl residues may be 
deamidated to the corresponding aspartyl or glutamyl residues, respectively, under mildly 
acidic conditions. Either form of these residues falls within the scope of this invention. 
[0077] Other modifications that may be formed include the hydroxylation of proline and
lysine, phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the
alpha-amino groups of lysine, arginine and histidine side chains (T. E. Creighton, Proteins:
Structure and Molecular Properties, W.H.
Freeman & Co., San Francisco, pp. 79-86, 1983), acetylation of the N-terminal amine and 
amidation of any C-terminal carboxyl group. 

[0078] As can be seen, modifications of the nucleic acid sequence encoding HDAC-2 may be
accomplished by a variety of well-known techniques, such as site-directed mutagenesis (see,
Gillman and Smith, Gene 8:81-97 (1979) and Roberts, S. et al., Nature 328:731-734 (1987)).
When modifications are made, they may optionally be evaluated for their effect
on a variety of different properties including, for example, solubility, crystallizability, and
the protein's structure and activity.

[0079] In one variation, the variant and/or fragment of wild-type HDAC-2 is functional in 
the sense that the resulting protein is capable of associating with at least one same chemical 
entity that is also capable of selectively associating with a protein comprising SEQ. ID No. 1 
since this common associative ability evidences that at least a portion of the native structure has
been conserved. That chemical entity may optionally be TSA.

[0080] It is noted that the activity of the native protein need not necessarily be conserved. 
Rather, amino acid substitutions, additions or deletions that interfere with native activity but 
which do not significantly alter the three-dimensional structure of the domain are specifically 
contemplated by the invention. Crystals comprising such variants of HDAC-2, and the atomic 
structure coordinates obtained therefrom, can be used to identify compounds that bind to the 
native domain. These compounds may affect the activity of the native domain.
[0081] Amino acid substitutions, deletions and additions that do not significantly interfere 
with the three-dimensional structure of HDAC-2 will depend, in part, on the region where the 
substitution, addition or deletion occurs in the crystal structure. These modifications to the 
protein can now be made far more intelligently with the crystal structure information provided 
herein. In highly variable regions of the molecule, non-conservative substitutions as well as 
conservative substitutions may be tolerated without significantly disrupting the three- 
dimensional structure of the molecule. In highly conserved regions, or regions containing 
significant secondary structure, conservative amino acid substitutions are preferred. 
[0082] Conservative amino acid substitutions are well known in the art, and include 
substitutions made on the basis of similarity in polarity, charge, solubility, hydrophobicity, 
hydrophilicity and/or the amphipathic nature of the amino acid residues involved. For 
example, negatively charged amino acids include aspartic acid and glutamic acid; positively 
charged amino acids include lysine and arginine; amino acids with uncharged polar head 
groups having similar hydrophilicity values include the following: leucine; isoleucine; valine; 
glycine; alanine; asparagine; glutamine; serine; threonine; phenylalanine; and tyrosine. Other
conservative amino acid substitutions are well known in the art. 

[0083] It should be understood that the protein may be produced in whole or in part by 
chemical synthesis. As a result, the selection of amino acids available for substitution or 
addition is not limited to the genetically encoded amino acids. Indeed, mutants may optionally 
contain non-genetically encoded amino acids. Conservative amino acid substitutions for many 
of the commonly known non-genetically encoded amino acids are well known in the art. 
Conservative substitutions for other amino acids can be determined based on their physical 
properties as compared to the properties of the genetically encoded amino acids. 
[0084] In some instances, it may be particularly advantageous or convenient to substitute,
delete and/or add amino acid residues in order to provide convenient cloning sites in cDNA 
encoding the polypeptide, to aid in purification of the polypeptide, etc. Such substitutions, 
deletions and/or additions which do not substantially alter the three dimensional structure of 
HDAC-2 will be apparent to those having skill in the art, particularly in view of the three
dimensional structure of HDAC-2 provided herein. 

2. Cloning, Expression and Purification of HDAC-2 
[0085] The gene encoding HDAC-2 can be isolated from RNA, cDNA or cDNA libraries. 
In this case, the portion of the gene encoding amino acid residues 1-488 was isolated and is 
shown as SEQ. ID No. 2.

[0086] Construction of expression vectors and recombinant proteins from the DNA 
sequence encoding HDAC-2 may be performed by various methods well known in the art. For 
example, these techniques may be performed according to Sambrook et al., Molecular Cloning- 
A Laboratory Manual, Cold Spring Harbor, N.Y. (1989), and Kriegler, M., Gene Transfer and 
Expression, A Laboratory Manual, Stockton Press, New York (1990). 

[0087] A variety of expression systems and hosts may be used for the expression of
HDAC-2. Example 1 provides one such expression system.

[0088] Once expressed, purification steps are employed to produce HDAC-2 in a relatively 
homogeneous state. In general, a higher purity solution of a protein increases the likelihood 
that the protein will crystallize. Typical purification methods include the use of centrifugation,
partial fractionation using salt or organic compounds, dialysis, conventional column
chromatography (such as ion exchange, molecular sizing chromatography, etc.), high
performance liquid chromatography (HPLC), and gel electrophoresis methods (see, e.g.,
Deutscher, "Guide to Protein Purification" in Methods in Enzymology (1990), Academic Press,
Berkeley, California).

[0089] HDAC-2 may optionally be affinity labeled during cloning, preferably with a
poly-histidine (His₆) region, in order to facilitate purification. With the use of an affinity label, it is
possible to perform a one-step purification process on a purification column that has a unique 
affinity for the label. The affinity label may be optionally removed after purification. These 
and other purification methods are known and will be apparent to one of skill in the art. 

3. Crystallization & Crystals Comprising HDAC-2
[0090] One aspect of the present invention relates to methods for forming crystals 
comprising HDAC-2 as well as crystals comprising HDAC-2. 

[0091] In one embodiment, a method for forming crystals comprising HDAC-2 is provided 
comprising forming a crystallization volume comprising HDAC-2, one or more precipitants, 
optionally a buffer, optionally a monovalent or divalent salt and optionally an organic solvent; 
and storing the crystallization volume under conditions suitable for crystal formation. 
[0092] In yet another embodiment, a method for forming crystals comprising HDAC-2 is 
provided comprising forming a crystallization volume comprising HDAC-2 in solution 
comprising the components shown in Table 5; and storing the crystallization volume under 
conditions suitable for crystal formation. 

Table 5

Precipitant             5-50% w/v comprising one or more of any of the PEGs from the 200-20000
                        molecular weight range, 2-methyl-2,4-pentanediol (MPD) or isopropanol

pH                      pH 4-10. Buffers that may be used include, but are not limited to, imidazole,
                        acetate, HEPES, citrate, Tris, CHES, MES and combinations thereof.

Additives               0.01 mM-3 M comprising one or more of any monovalent or divalent cation,
                        including, but not limited to, Ca²⁺, Zn²⁺, Mg²⁺, Mn²⁺, Na⁺ or K⁺, and/or
                        ammonium sulfate.

Protein Concentration   1 mg/ml - 50 mg/ml

Temperature             1°C-25°C

[0093] In yet another embodiment, a method for forming crystals comprising HDAC-2 is 
provided comprising forming a crystallization volume comprising HDAC-2; introducing 
crystals comprising HDAC-2 as nucleation sites, and storing the crystallization volume under 
conditions suitable for crystal formation. 

[0094] Crystallization experiments may optionally be performed in volumes commonly used 
in the art, for example typically 15, 10, 5, 2 microliters or less. It is noted that the 
crystallization volume optionally has a volume of less than 1 microliter, optionally 500, 250,
150, 100, 50 or less nanoliters. 

[0095] It is also noted that crystallization may be performed by any crystallization method 
including, but not limited to batch, dialysis and vapor diffusion (e.g., sitting drop and hanging 
drop) methods. Micro and/or macro seeding of crystals may also be performed to facilitate 
crystallization. 

[0096] It should be understood that forming crystals comprising HDAC-2 and crystals 
comprising HDAC-2 according to the invention are not intended to be limited to the wild-type, 
full length HDAC-2 shown in SEQ. ID No. 1. Rather, it should be recognized that the 
invention may be extended to various other fragments and variants of wild-type HDAC-2 as 
described above. 

[0097] It should also be understood that forming crystals comprising HDAC-2 and crystals
comprising HDAC-2 according to the invention may be such that HDAC-2 is complexed with
one or more ligands, including one or more copies of the same ligand. The ligand used to form
the complex may be any ligand capable of binding to HDAC-2. In one variation, the ligand is a
natural substrate. In another variation, the ligand is an inhibitor, such as trichostatin A (TSA).
In one particular variation, the ligand binds to the binding pocket of the protein. Examples of
such ligands include, but are not limited to, small molecule inhibitors of HDAC-2 such as
TSA. In one particular variation, the crystallizable compositions of this
invention comprise one or more copies of TSA as the ligand.

[0098] Optionally, the HDAC-2 complex may further comprise divalent cations, especially 
zinc which may be introduced in any suitable manner. For example, the cations may be 
introduced by incubating the desired divalent cation with a suitable metal salt such as MgCl₂
prior to incubation with the HDAC-2 protein. 

[0099] In one particular embodiment, HDAC-2 crystals have a crystal lattice in the P2₁
space group. HDAC-2 crystals may also optionally have unit cell dimensions, +/- 5%, of
a=79.9 Å, b=56.9 Å, c=95.2 Å, α=90°, β=90.5°, and γ=90°.

[0100] In one particular embodiment, HDAC-2 crystals have a crystal lattice in a P2₁2₁2₁
space group. HDAC-2 crystals may also optionally have unit cell dimensions, +/- 5%, of
a=92.1 Å, b=97.6 Å, c=138.9 Å, and α=β=γ=90°.

[0101] HDAC-2 crystals also preferably are capable of diffracting X-rays for determination
of atomic coordinates to a resolution of 4 Å, 3 Å, 2.5 Å, 2 Å or greater.

[0102] Crystals comprising HDAC-2 may be formed by a variety of different methods 
known in the art. For example, crystallizations may be performed by batch, dialysis, and vapor 
diffusion (sitting drop and hanging drop) methods. A detailed description of basic protein 
crystallization setups may be found in McRee, D. and David, P., Practical Protein
Crystallography, 2nd Ed. (1999), Academic Press Inc. Further descriptions regarding
performing crystallization experiments are provided in Stevens, et al. (2000) Curr. Opin.
Struct. Biol. 10(5):558-63, and U.S. Patent Nos. 6,296,673, 5,419,278, and 5,096,676.
[0103] In one variation, crystals comprising HDAC-2 are formed by mixing substantially 
pure HDAC-2 with an aqueous buffer containing a precipitant at a concentration just below a 
concentration necessary to precipitate the protein. One suitable precipitant for crystallizing 
HDAC-2 is polyethylene glycol (PEG), which combines some of the characteristics of the salts 
and other organic precipitants (see, for example, Ward et al., J. Mol. Biol. 98:161, 1975, and
McPherson, J. Biol. Chem. 251:6300, 1976).

[0104] During a crystallization experiment, water is removed by diffusion or evaporation to 
increase the concentration of the precipitant, thus creating precipitating conditions for the 
protein. In one particular variation, crystals are grown by vapor diffusion in hanging drops or 
sitting drops. According to these methods, a protein/precipitant solution is formed and then 
allowed to equilibrate in a closed container with a larger aqueous reservoir having a precipitant 
concentration for producing crystals. The protein/precipitant solution continues to equilibrate 
until crystals grow. 

[0105] By performing submicroliter volume sized crystallization experiments, as detailed in 
U.S. Patent No. 6,296,673, effective crystallization conditions for forming crystals of HDAC-2 
complexed with a range of compounds were obtained. In order to accomplish this, systematic 
broad screen crystallization trials were performed on HDAC-2 using the sitting drop technique. 
In each experiment, a 100 nL mixture of HDAC-2 complexed with different inhibitors and
precipitant was placed on a platform positioned over a well containing 50-100 μL of the
precipitating solution. Precipitate and crystal formation was detected in the sitting drops. Fine 
screening was then carried out for those crystallization conditions that appeared to produce 
precipitate and/or crystal in the drops. 

[0106] Based on the crystallization experiments that were performed, a thorough
understanding of how different crystallization conditions affect HDAC-2 crystallization was 
obtained. Based on this understanding, a series of crystallization conditions were identified 
that may be used to form crystals comprising HDAC-2. These conditions are summarized in 
Table 5. Particular examples of crystallization conditions that may be used to form diffraction 
quality crystals of HDAC-2 are detailed in Examples 1-2. Figures 2A-2B illustrate crystals of 
HDAC-2 complexes formed using the crystallization conditions provided in Table 5. 
[0107] One skilled in the art will recognize that the crystallization conditions provided in 
Table 5 and Examples 1-2 can be varied and still yield protein crystals comprising HDAC-2. 
For example, it is noted that variations on the crystallization conditions described herein can be 
readily determined by taking the conditions provided in Table 5 and performing fine screens 
around those conditions by varying the type and concentration of the components in order to 
determine additional suitable conditions for crystallizing HDAC-2, variants of HDAC-2, and 
ligand complexes thereof. 
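
A fine screen of this kind can be laid out as a small grid around a starting condition. The following is a minimal sketch, assuming only two components are varied (precipitant concentration and pH) and that candidates are kept within the Table 5 ranges of 5-50% w/v and pH 4-10; the function name, step sizes and starting values are hypothetical and are not conditions taken from the Examples.

```python
from itertools import product
from typing import List, Tuple

# Ranges taken from Table 5.
PRECIPITANT_RANGE = (5.0, 50.0)  # % w/v
PH_RANGE = (4.0, 10.0)


def fine_screen(
    center_precipitant: float,
    center_ph: float,
    precipitant_step: float = 1.0,
    ph_step: float = 0.2,
    steps_each_side: int = 2,
) -> List[Tuple[float, float]]:
    """Return (precipitant % w/v, pH) pairs on a grid centered on a starting condition."""
    offsets = range(-steps_each_side, steps_each_side + 1)
    precipitants = [round(center_precipitant + i * precipitant_step, 3) for i in offsets]
    ph_values = [round(center_ph + i * ph_step, 3) for i in offsets]
    return [
        (p, ph)
        for p, ph in product(precipitants, ph_values)
        if PRECIPITANT_RANGE[0] <= p <= PRECIPITANT_RANGE[1]
        and PH_RANGE[0] <= ph <= PH_RANGE[1]
    ]


# Hypothetical starting condition: 20% w/v precipitant at pH 7.0.
conditions = fine_screen(20.0, 7.0)
# A starting point near the edge of the Table 5 ranges yields a truncated grid.
edge_conditions = fine_screen(6.0, 4.2)
```

Additional axes, such as additive concentration or temperature, could be added to the grid in the same way.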

[0108] Crystals comprising HDAC-2 have a wide range of uses. For example, now that 
crystals comprising HDAC-2 have been produced, it is noted that crystallizations may be 
performed using such crystals as a nucleation site within a concentrated protein solution. 
According to this variation, a concentrated protein solution is prepared and a crystalline 
material (microcrystals) is used to 'seed' the protein solution to assist nucleation for crystal 
growth. If the concentrations of the protein and any precipitants are optimal for crystal growth, 
the seed crystal will provide a nucleation site around which a larger crystal forms. Given the 
ability to form crystals comprising HDAC-2 according to the present invention, the crystals so 
formed can be used by this crystallization technique to initiate crystal growth of other HDAC-2 
comprising crystals, including HDAC-2 complexed to other ligands. 

[0109] As will be described herein in greater detail, crystals may also be used to perform
X-ray or neutron diffraction analysis in order to determine the three-dimensional structure of
HDAC-2 and, in particular, to assist in the identification of its active site. Knowledge of the 
binding site region allows rational design and construction of ligands including inhibitors. 
Crystallization and structural determination of HDAC-2 mutants having altered bioactivity 
allows the evaluation of whether such changes are caused by general structure deformation or 
by side chain alterations at the substitution site. 

4. X-Ray Data Collection and Structure Determination
[0110] Crystals comprising HDAC-2 may be obtained as described above in Section 3. As 
described herein, these crystals may then be used to perform x-ray data collection and for 
structure determination. 

[0111] In one embodiment, described in Example 1, crystals of an HDAC-2-Zn²⁺-SAHA
complex were obtained where HDAC-2 has the sequence of residues shown in SEQ. ID No. 4. 
These particular crystals were used to determine the three dimensional structure of HDAC-2. 
However, it is noted that other crystals comprising HDAC-2 including different HDAC-2 
variants, fragments, and complexes thereof may also be used. 

[0112] Diffraction data were collected from cryocooled crystals (100 K) of the
HDAC-2-Zn²⁺-SAHA complex at the Advanced Light Source beam line 5.0.2 (Berkeley, CA) using an
ADSC CCD detector. The diffraction pattern of the HDAC-2-Zn²⁺-SAHA complex displayed
symmetry consistent with space group P2₁ with unit cell dimensions a=79.9 Å, b=56.9 Å,
c=95.2 Å, α=90°, β=90.5°, and γ=90°. Data were collected and integrated to 1.8 Å with DENZO and
scaled with SCALEPACK (Z. Otwinowski and W. Minor, "Processing of X-ray Diffraction
Data Collected in Oscillation Mode", Methods in Enzymology, Volume 276: Macromolecular
Crystallography, Part A, pages 307-326, 1997, C. W. Carter, Jr. & R. M. Sweet, Eds., Academic
Press).

[0113] All crystallographic calculations were performed using the CCP4 program package
(Collaborative Computational Project, Number 4. The CCP4 Suite: Programs for Protein
Crystallography. Acta Cryst. D50, 760-763 (1994)). The initial phases for the HDAC-2-Zn²⁺-SAHA
complex were obtained by the molecular replacement method using the program
MOLREP. The coordinates of histone deacetylase HDAC-8, previously determined and
disclosed in U.S. Application Serial Nos. 10/601,335 and 10/601,058, filed on June 20, 2003,
were used as a search model for the solution of the HDAC-2-Zn²⁺-SAHA structure. The
highest-ranked solution from the translation function was subjected to rigid body refinement
against the maximum likelihood target function as implemented in REFMAC (CCP4). Rigid body
refinement was followed by 50 cycles of iterative map/model/phase improvement using
ARP_WARP map improvement (Perrakis, A., Morris, R.J. & Lamzin, V.S. Automated protein
model building combined with iterative structure refinement). This was followed
by alternating cycles of manual rebuilding of the model with Xfit (McRee, D.E. XtalView/Xfit-
A versatile program for manipulating atomic coordinates and electron density. J. Struct. Biol.
125, 156-65 (1999)), ARP_WARP map improvement and
geometrically restrained refinement against a maximum likelihood target function as
implemented in REFMAC (CCP4) until the refinement reached convergence. All stages of
model refinement were carried out with bulk solvent correction and anisotropic scaling. The data
collection and data refinement statistics are given in Table 6A.



TABLE 6A

Crystal data
    Ligands                               SAHA, water, Zn²⁺, Na⁺, SO₄²⁻
    Space group                           P2₁
    Unit cell dimensions                  a=79.9 Å, b=56.9 Å, c=95.2 Å, α=90°, β=90.5°, γ=90°

Data collection                           HDAC-2-Zn²⁺-SAHA
    X-ray source                          ALS 5.0.2
    Wavelength [Å]                        1.0
    Resolution [Å]                        50-1.8
    Observations (unique)                 256208 (76785)
    Redundancy                            3
    Completeness overall (outer shell)    96.7% (72.4%)
    I/σ(I) overall (outer shell)          11.4 (1.8)
    Rsymm¹ overall (outer shell)          0.108 (0.39)

Refinement
    Reflections used                      72932
    R-factor                              19.1%
    Rfree                                 22.1%
    r.m.s. bonds                          0.008
    r.m.s. angles                         1.06

¹ $R_{symm} = \sum_{hkl}\sum_{i} \left| I(hkl)_i - \langle I(hkl) \rangle \right| / \sum_{hkl}\sum_{i} \langle I(hkl)_i \rangle$ over $i$ observations
of a reflection $hkl$
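
The Rsymm statistic defined in the footnote to Table 6A can be illustrated with a small calculation. The following is a minimal sketch, assuming the observations have already been reduced to unique hkl indices and that the denominator is the sum of the mean intensity over all observations of each reflection; the function name and intensity values are hypothetical and are not data from the HDAC-2 crystals.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

HKL = Tuple[int, int, int]


def r_symm(observations: Iterable[Tuple[HKL, float]]) -> float:
    """Sum of |I_i - <I>| over all observations, divided by the sum of <I> over all observations."""
    grouped: Dict[HKL, List[float]] = defaultdict(list)
    for hkl, intensity in observations:
        grouped[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in grouped.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += mean_intensity * len(intensities)
    return numerator / denominator


# Toy observations invented for illustration.
toy_observations = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((1, 0, 0), 90.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 50.0),
]
toy_r_symm = r_symm(toy_observations)
```

For the toy data, the deviations sum to 20 and the mean intensities summed over all five observations give 400, so the ratio is 0.05.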



[0114] During structure determination, where the unit cell dimensions were a=79.9 Å,
b=56.9 Å, c=95.2 Å, α=90°, β=90.5°, and γ=90°, it was found that the asymmetric unit
comprised two HDAC-2-Zn²⁺-SAHA molecules.
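
The unit cell volume implied by these dimensions, and a rough packing estimate for two molecules per asymmetric unit, can be computed as follows. This is a minimal sketch: the general triclinic volume formula is standard, space group P2₁ has two general positions per cell, and the molecular mass used (55 kDa, the figure quoted above for full-length HDAC-2) is an assumption, since the mass of the crystallized construct is not given here. The function names are hypothetical.

```python
import math


def cell_volume(a: float, b: float, c: float, alpha: float, beta: float, gamma: float) -> float:
    """Unit cell volume in cubic Angstroms from cell lengths (Angstroms) and angles (degrees)."""
    cos_a, cos_b, cos_g = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(
        1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    )


def matthews_coefficient(
    volume: float, general_positions: int, molecules_per_asu: int, mass_da: float
) -> float:
    """Cell volume per unit of protein mass, in cubic Angstroms per dalton."""
    return volume / (general_positions * molecules_per_asu * mass_da)


# Cell dimensions from the text; the mass is an assumed value (full-length HDAC-2).
volume_p21 = cell_volume(79.9, 56.9, 95.2, 90.0, 90.5, 90.0)
vm_estimate = matthews_coefficient(
    volume_p21, general_positions=2, molecules_per_asu=2, mass_da=55000.0
)
```

Under these assumptions the estimate comes out a little under 2 cubic Angstroms per dalton; a construct of different mass would change the value proportionally.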

[0115] In one embodiment, described in Example 2, crystals of an HDAC-2-Zn²⁺-TSA
complex were obtained where HDAC-2 has the sequence of residues shown in SEQ. ID No. 5.
These particular crystals were used to determine the three dimensional structure of HDAC-2.
However, it is noted that other crystals comprising HDAC-2 including different HDAC-2
variants, fragments, and complexes thereof may also be used.

[0116] Diffraction data were collected from cryocooled crystals (100 K) of the
HDAC-2-Zn²⁺-TSA complex at the Advanced Light Source beam line 5.0.3 (Berkeley, CA) using an
ADSC CCD detector. The diffraction pattern of the HDAC-2-Zn²⁺-TSA complex displayed
symmetry consistent with space group P2₁2₁2₁ with unit cell dimensions a=92.1 Å, b=97.6 Å,
c=138.9 Å, and α=β=γ=90°. Data were collected and integrated to 1.85 Å with DENZO and scaled
with SCALEPACK (Z. Otwinowski and W. Minor, "Processing of X-ray Diffraction Data
Collected in Oscillation Mode", Methods in Enzymology, Volume 276: Macromolecular
Crystallography, Part A, pages 307-326, 1997, C. W. Carter, Jr. & R. M. Sweet, Eds., Academic
Press).

[0117] All crystallographic calculations were performed using the CCP4 program package
(Collaborative Computational Project, Number 4. The CCP4 Suite: Programs for Protein
Crystallography. Acta Cryst. D50, 760-763 (1994)). The initial phases for the HDAC-2-Zn2+-TSA
complex were obtained by the molecular replacement method using the program MOLREP.
The coordinates of histone deacetylase HDAC-8 previously determined and disclosed in U.S.
Application Serial Nos. 10/601,335 and 10/601,058, filed on June 20, 2003, were used as a
search model for the solution of the HDAC-2-Zn2+-TSA structure. The highest-scoring solution
from the translation function was subjected to rigid body refinement against the maximum
likelihood target function as implemented in REFMAC (CCP4). Rigid body refinement was
followed by 50 cycles of iterative map/model/phase improvement using ARP_WARP map
improvement (Perrakis, A., Morris, R.J. & Lamzin, V.S. Automated protein model building
combined with iterative structure refinement). This was followed by alternating
cycles of manual rebuilding of the model with Xfit (McRee, D.E. XtalView/Xfit-A versatile
program for manipulating atomic coordinates and electron density. J. Struct. Biol. 125, 156-65
(1999)), ARP_WARP map improvement (Perrakis et al.) and geometrically
restrained refinement against a maximum likelihood target function as implemented in
REFMAC (CCP4) until the refinement reached convergence. All stages of model refinement
were carried out with bulk solvent correction and anisotropic scaling. The data collection and data
refinement statistics are given in Table 6B.

TABLE 6B

Crystal data
  Ligands                               TSA, water, Zn2+, Na+
  Space group                           P2₁2₁2₁
  Unit cell dimensions                  a=92.1 Å, b=97.6 Å, c=138.9 Å,
                                        α=β=γ=90°

Data collection                         HDAC-2-Zn2+-TSA
  X-ray source                          ALS 5.0.3
  Wavelength [Å]                        1.0
  Resolution [Å]                        50-1.84
  Observations (unique)                 361033 (107684)
  Redundancy                            3
  Completeness overall (outer shell)    99.6% (99%)
  I/σ(I) overall (outer shell)          15 (2)
  Rsymm¹ overall (outer shell)          0.069 (0.58)

Refinement
  Reflections used                      102236
  R-factor                              18.66%
  Rfree                                 21.79%
  r.m.s bonds                           0.007
  r.m.s angles                          1.01

¹ Rsymm = $\sum_{hkl}\sum_{i} | I(hkl)_i - \langle I(hkl) \rangle | \,/\, \sum_{hkl}\sum_{i} \langle I(hkl)_i \rangle$ over i observations
of a reflection hkl



[0118] During structure determination, where the unit cell dimensions were a=92.1 Å,
b=97.6 Å, c=138.9 Å, and α=β=γ=90°, it was realized that the asymmetric unit comprised three
HDAC-2-Zn2+-TSA molecules. Structure coordinates were determined for this complex and
the resultant set of structural coordinates from the refinement are presented in Figure 3.

[0119] The binding pocket, shown in Figure 5, was observed in this structure with one TSA
molecule bound in the pocket. TSA binds with its hydroxamate moiety ligating the zinc ion
bound at the bottom of the pocket. Key interactions between groups in the binding pockets and
the TSA molecules are depicted and described in Figure 5.

[0120] It is noted that the sequence of the structure coordinates presented in Figure 3 differs
in some regards from the sequence shown in SEQ. ID No. 5. Structure coordinates
are not reported for some residues because the electron density obtained was insufficient to
identify the position of these residues. For Figure 3, chain A, structure coordinates for residues
1-12 and 379-409 (using numbering from SEQ. ID No. 5) are not reported. For Figure 3, chain B,
structure coordinates for residues 1-13 and 379-409 are not reported. For Figure 3, chain C,
structure coordinates for residues 1-13 and 379-409 are not reported.

[0121] Those of skill in the art understand that a set of structure coordinates (such as those 
in Figure 3) for a protein or a protein-complex or a portion thereof, is a relative set of points 
that define a shape in three dimensions. Thus, it is possible that an entirely different set of 
structure coordinates could define a similar or identical shape. Moreover, slight variations in 
the individual coordinates may have little effect on overall shape. In terms of binding pockets, 
these variations would not be expected to significantly alter the nature of ligands that could 
associate with those pockets. The term "binding pocket" as used herein refers to a region of the 
protein that, as a result of its shape, favorably associates with a ligand. 

[0122] These variations in coordinates may be generated because of mathematical 
manipulations of the HDAC-2 structure coordinates. For example, the sets of structure 
coordinates shown in Figure 3 could be manipulated by crystallographic permutations of the 
structure coordinates, fractionalization of the structure coordinates, application of a rotation 
matrix, integer additions or subtractions to sets of the structure coordinates, inversion of the 
structure coordinates or any combination of the above. 
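
The point that such manipulations leave the described shape unchanged can be sketched as follows. This is a minimal illustration under stated assumptions: the two atom positions are hypothetical, the helper names are invented for this example, and an orthogonal cell (all angles 90°, as reported for the P2₁2₁2₁ crystals) is assumed so that fractionalization is a simple division by the cell edges.

```python
import math

CELL = (92.1, 97.6, 138.9)  # orthorhombic cell edges in angstroms


def to_fractional(xyz, cell=CELL):
    """Cartesian to fractional coordinates (valid for an orthogonal cell only)."""
    return tuple(v / edge for v, edge in zip(xyz, cell))


def to_cartesian(frac, cell=CELL):
    """Fractional back to Cartesian coordinates (orthogonal cell only)."""
    return tuple(v * edge for v, edge in zip(frac, cell))


def lattice_shift(frac, shift):
    """Add whole numbers of unit cells to fractional coordinates."""
    return tuple(v + s for v, s in zip(frac, shift))


def rotate_about_z(xyz, angle_deg):
    """Apply a rotation matrix for a rotation about the z axis."""
    t = math.radians(angle_deg)
    x, y, z = xyz
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t), z)


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manipulate(xyz):
    """Fractionalize, shift by integer cell translations, convert back, rotate."""
    frac = lattice_shift(to_fractional(xyz), (1, 0, -1))
    return rotate_about_z(to_cartesian(frac), 37.0)


# Two hypothetical atom positions, 5 angstroms apart.
atom_1, atom_2 = (10.0, 20.0, 30.0), (13.0, 24.0, 30.0)
before = distance(atom_1, atom_2)
after = distance(manipulate(atom_1), manipulate(atom_2))
print(before, after)
```

The individual coordinates change completely, but the interatomic distance, and hence the shape, is preserved to within floating-point rounding.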

[0123] Alternatively, modifications in the crystal structure due to mutations, additions, 
substitutions, and/or deletions of amino acids or other changes in any of the components that 
make up the crystal could also account for variations in structure coordinates. If such 
variations are within an acceptable standard error as compared to the original coordinates, the 
resulting three-dimensional shape should be considered to be the same. Thus, for example, a 
ligand that bound to the active site binding pocket of HDAC-2 would also be expected to bind 
to another binding pocket whose structure coordinates defined a shape that fell within the 
acceptable error. 

[0124] Various computational methods may be used to determine whether a particular
protein or a portion thereof (referred to here as the "target protein"), typically the binding
pocket, has a high degree of three-dimensional spatial similarity to another protein (referred to
here as the "reference protein") against which the target protein is being compared.

[0125] The process of comparing a target protein structure to a reference protein structure
may generally be divided into three steps: 1) defining the equivalent residues and/or atoms for
the target and reference proteins; 2) performing a fitting operation between the proteins; and 3)
analyzing the results. These steps are described in more detail below. All structure
comparisons reported herein and the structure comparisons claimed are intended to be based on
the particular comparison procedure described below.

[0126] Equivalent residues or atoms can be determined based upon an alignment of primary
sequences of the proteins, an alignment of their structural domains or as a combination of both.
Sequence alignments generally implement the dynamic programming algorithm of Needleman
and Wunsch [J. Mol. Biol. 48: 443-453, 1970]. For the purpose of this invention the sequence
alignment was performed using the publicly available software package MOE (Chemical
Computing Group Inc.), version 2002.3. When using the MOE program, alignment
was performed in the sequence editor window using the ALIGN option utilizing the following
program parameters: Initial pairwise Build-up: ON, Substitution Matrix: Blosum62, Round
Robin: ON, Gap Start: 7, Gap Extend: 1, Iterative Refinement: ON, Build-up: TREE-BASED,
Secondary Structure: NONE, Structural Alignment: ENABLED, Gap Start: 1, Gap Extend: 0.1.
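
The dynamic programming idea behind Needleman-Wunsch alignment can be sketched in a few lines. This is not the MOE implementation: as a simplifying assumption it scores identities and mismatches with fixed values and uses a single linear gap penalty, rather than the Blosum62 matrix and the gap start/extend parameters listed above. The function name and the two short sequences are hypothetical.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # leading gaps in seq_b
    for j in range(1, cols):
        score[0][j] = j * gap  # leading gaps in seq_a

    def pair_score(i, j):
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # Fill the matrix: each cell is the best of a match/mismatch or a gap.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical short sequences (one-letter amino acid codes).
print(needleman_wunsch("MAYS", "MYS"))
```

Columns in which neither sequence has a gap identify the equivalent residues that are passed on to the fitting step.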

[0127] Once aligned, a rigid body fitting operation is performed where the structure for the
target protein is translated and rotated to obtain an optimum fit relative to the structure of the 
reference protein. The fitting operation uses an algorithm that computes the optimum 
translation and rotation to be applied to the moving structure, such that the root mean square 
deviation of the fit over the specified pairs of equivalent atoms is an absolute minimum. For 
the purpose of fitting operations made herein, the publicly available software program MOE 
(Chemical Computing Group Inc.) v. 2002.3 was used. 
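
A least-squares rigid body fit of this kind can be sketched as follows. This is not the algorithm used by MOE, which is not described here; as an assumption, the sketch uses Horn's closed-form quaternion method, with the dominant eigenvector found by simple power iteration so that only the Python standard library is needed. All names and the five test points are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def rmsd(a, b):
    """Root mean square deviation over paired points."""
    total = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(total / len(a))


def superpose(target, reference, iterations=1000):
    """Return target coordinates after the best rigid fit onto reference."""
    ct, cr = centroid(target), centroid(reference)
    p = [tuple(v - c for v, c in zip(pt, ct)) for pt in target]
    q = [tuple(v - c for v, c in zip(pt, cr)) for pt in reference]
    # Correlation sums S[a][b] = sum_i p_i[a] * q_i[b].
    s = [[sum(pi[a] * qi[b] for pi, qi in zip(p, q)) for b in range(3)] for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    n = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shift the spectrum to be non-negative so power iteration converges
    # to the eigenvector of the largest eigenvalue.
    shift = max(sum(abs(v) for v in row) for row in n)
    for k in range(4):
        n[k][k] += shift
    quat = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        nxt = [sum(n[r][c] * quat[c] for c in range(4)) for r in range(4)]
        norm = math.sqrt(sum(v * v for v in nxt))
        quat = [v / norm for v in nxt]
    w, x, y, z = quat
    rot = [
        [w * w + x * x - y * y - z * z, 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), w * w - x * x + y * y - z * z, 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), w * w - x * x - y * y + z * z],
    ]
    fitted = []
    for pt in p:
        rotated = tuple(sum(rot[r][c] * pt[c] for c in range(3)) for r in range(3))
        fitted.append(tuple(rv + c for rv, c in zip(rotated, cr)))
    return fitted


# Hypothetical reference points, and a rotated and translated copy as the target.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
theta = math.radians(40.0)
target = [
    (x * math.cos(theta) - y * math.sin(theta) + 5.0,
     x * math.sin(theta) + y * math.cos(theta) - 2.0,
     z + 7.0)
    for x, y, z in reference
]
fitted = superpose(target, reference)
print(rmsd(target, reference), rmsd(fitted, reference))
```

A production implementation would normally use a robust eigenvalue or singular value routine instead of power iteration, which can converge slowly when the two largest eigenvalues are close.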

[0128] The results from this process are typically reported as an RMSD value between two 
sets of atoms. The term "root mean square deviation" means the square root of the arithmetic 
mean of the squares of deviations. It is a way to express the deviation or variation from a trend 
or object. As used herein, an RMSD value refers to a calculated value based on variations in 
the atomic coordinates of a target protein from the atomic coordinates of a reference protein or 
portions thereof. The structure coordinates for HDAC-2, provided in Figure 3, are used as
the reference protein in these calculations. 
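
The definition above can be restated compactly. The notation is an assumption introduced only for this illustration: N is the number of pairs of equivalent atoms, and the two position vectors are those of the i-th atom in the fitted target and in the reference.

```latex
\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}
  \left\lVert \mathbf{r}_i^{\mathrm{target}} - \mathbf{r}_i^{\mathrm{reference}} \right\rVert^{2}}
```

For instance, two atom pairs separated by 3 Å and 4 Å after fitting give an RMSD of √((9 + 16)/2) ≈ 3.54 Å.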

[0129] The same set of atoms was used for initial fitting of the structures and for computing
root mean square deviation values. For example, if a root mean square deviation (RMSD)
between Cα atoms of two proteins is needed, the proteins in question should be superposed
only on the Cα atoms and not on any other set of atoms. Similarly, if an RMSD calculation for
all atoms is required, the superposition of two structures should be performed on all atoms.
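
Selecting one consistent atom set for both fitting and RMSD can be sketched as follows. The sketch assumes, purely for illustration, that the coordinates are held as fixed-column PDB-style ATOM records and that equivalent residues already share chain and residue numbers; the helper names and the two-residue fragments are hypothetical.

```python
def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal fixed-column ATOM record for the demonstration below."""
    return "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
        "ATOM", serial, name, "", res_name, chain, res_seq, "", x, y, z
    )


def atom_coordinates(lines, atom_names):
    """Map (chain, residue number, atom name) to xyz for the chosen atom names."""
    coords = {}
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name in atom_names:
            key = (line[21], int(line[22:26]), name)
            coords[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def matched_atoms(target_lines, reference_lines, atom_names):
    """Return paired coordinate lists for atoms present in both structures."""
    target = atom_coordinates(target_lines, atom_names)
    reference = atom_coordinates(reference_lines, atom_names)
    keys = sorted(target.keys() & reference.keys())
    return [target[k] for k in keys], [reference[k] for k in keys]


# Hypothetical two-residue fragments (coordinates are illustrative only).
reference_lines = [
    make_atom_line(1, "N", "GLY", "A", 13, 0.0, 0.0, 0.0),
    make_atom_line(2, "CA", "GLY", "A", 13, 1.458, 0.0, 0.0),
    make_atom_line(3, "CA", "ALA", "A", 14, 3.8, 1.2, 0.5),
    make_atom_line(4, "CB", "ALA", "A", 14, 4.1, 2.6, 0.9),
]
target_lines = [
    make_atom_line(1, "N", "GLY", "A", 13, 0.1, 0.0, 0.0),
    make_atom_line(2, "CA", "GLY", "A", 13, 1.5, 0.1, 0.0),
    make_atom_line(3, "CA", "SER", "A", 14, 3.9, 1.1, 0.4),
    make_atom_line(4, "CB", "SER", "A", 14, 4.0, 2.7, 1.0),
]

# The same paired lists would be used for both the fit and the RMSD.
ca_target, ca_reference = matched_atoms(target_lines, reference_lines, {"CA"})
print(len(ca_target), "alpha-carbon pairs")
```

Passing the same two lists to both the superposition and the RMSD calculation is what keeps the fitted set and the measured set identical.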

[0130] Based on a review of protein structures deposited in the Protein Databank (PDB),
1C3R was identified as having the smallest RMSD values relative to the structure coordinates 
provided herein. Table 7 below provides a series of RMSD values that were calculated by the 
above described process using the structure coordinates in Figure 3 as the reference protein and 
the structure coordinates from PDB code: 1C3R (Histone deacetylase like protein, HDLP) as 
the target protein. 



TABLE 7

AA residues used to perform       Portion of each AA residue used       RMSD [Å]
RMSD comparison with PDB:1C3R     to perform RMSD comparison with
                                  PDB:1C3R

Table 2 (4 Angstrom set)          alpha-carbon atoms¹                   0.67
                                  main-chain atoms¹                     0.71
                                  all non-hydrogen²                     0.99

Table 3 (7 Angstrom set)          alpha-carbon atoms¹                   0.78
                                  main-chain atoms¹                     0.89
                                  all non-hydrogen²                     1.20

Table 4 (10 Angstrom set)         alpha-carbon atoms¹                   0.91
                                  main-chain atoms¹                     1.02
                                  all non-hydrogen²                     1.25

SEQ. ID No. 3                     alpha-carbon atoms¹                   2.25
                                  main-chain atoms¹                     2.21
                                  all non-hydrogen²                     2.47

¹ - the RMSD computed between the atoms of all amino acids that are common to both the target and the
reference in the aligned and superposed structure. The amino acids need not be identical.

² - the RMSD computed only between identical amino acids, which are common to both the target and the
reference in the aligned and superposed structure.
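
The difference between the two footnotes can be sketched in code. The function name and the gapped sequences below are hypothetical; the sketch assumes an alignment is already available as two equal-length strings with "-" marking gaps.

```python
def residue_pairs(aligned_target, aligned_reference, identical_only=False):
    """Return (target_index, reference_index) pairs from a gapped alignment.

    identical_only=False follows footnote 1 (all commonly aligned residues);
    identical_only=True follows footnote 2 (only identical residues).
    """
    pairs = []
    t_index = r_index = 0
    for t_res, r_res in zip(aligned_target, aligned_reference):
        both_present = t_res != "-" and r_res != "-"
        if both_present and (not identical_only or t_res == r_res):
            pairs.append((t_index, r_index))
        if t_res != "-":
            t_index += 1
        if r_res != "-":
            r_index += 1
    return pairs


# Hypothetical alignment: one gap in each sequence and one substitution (Q/E).
common = residue_pairs("MAY-SQ", "M-YTSE")
identical = residue_pairs("MAY-SQ", "M-YTSE", identical_only=True)
print(common, identical)
```

Restricting all-atom comparisons to identical residues makes sense because only identical residues have the same set of side-chain atoms to pair.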

[0131] It is noted that mutants and variants of HDAC-2 as well as other histone deacetylases 
are likely to have similar structures despite having different sequences. For example, the 
binding pockets of these related proteins are likely to have similar contours. Accordingly, it 
should be recognized that the structure coordinates and binding pocket models provided herein 
have utility for these other related proteins. 

[0132] Accordingly, in one embodiment, the invention relates to data, computer readable 
media comprising data, and uses of the data where the data comprises all or a portion of the 
structure coordinates shown in Figure 3 or structure coordinates having a root mean square 
deviation (RMSD) equal to or less than the RMSD value specified in Columns 3, 4 or 5 of
Table 1 when compared to the structure coordinates of Figure 3, the root mean square deviation
being calculated such that the portion of amino acid residues specified in Column 2 of Table 1 
of each set of structure coordinates are superimposed and the root mean square deviation is 
based only on those amino acid residues in the structure coordinates that are also present in the 
portion of the protein specified in Column 1 of Table 1. 

[0133] As noted, there are many different ways to express the surface contours of the 
HDAC-2 structure other than by using the structure coordinates provided in Figure 3. 
Accordingly, it is noted that the present invention is also directed to any data, computer 
readable media comprising data, and uses of the data where the data defines a computer model 
for a protein binding pocket, at least a portion of the computer model having a surface contour 
that has a root mean square deviation equal to or less than a given RMSD value specified in 
Columns 3, 4 or 5 of Table 1 when the coordinates used to compute the surface contour are 
compared to the structure coordinates of Figure 3, wherein (a) the root mean square deviation is 
calculated by the calculation method set forth herein, (b) the portion of amino acid residues 
associated with the given RMSD value in Table 1 (specified in Column 2 of Table 1) are 
superimposed according to the RMSD calculation, and (c) the root mean square deviation is 
calculated based only on those amino acid residues present in both the protein being modeled 
and the portion of the protein associated with the given RMSD in Table 1 (specified in Column 
1 of Table 1). 

5. HDAC-2-Zn2+-TSA Structure
[0134] The present invention is also directed to a three-dimensional crystal structure of 
HDAC-2. This crystal structure may be used to identify binding sites, to provide mutants 
having desirable binding properties, and ultimately, to design, characterize, or identify ligands 
that interact with HDAC-2. 

[0135] The three-dimensional crystal structure of HDAC-2 may be generated, as is known
in the art, from the structure coordinates shown in Figure 3 and similar such coordinates.

[0136] The refined crystal structure of HDAC-2-Zn2+-TSA determined according to the
present invention contains amino acid residues 13-378 as numbered according to SEQ. ID No.
5 (based on the coordinates of Figure 3), one bound TSA molecule, and one Zn2+ ion. A total of
359 water molecules were included.

[0137] Figure 4A, and Figure 4B, a 90° rotation thereof, illustrate a ribbon diagram
overview of the structure of HDAC-2, highlighting the secondary structural elements of the
protein. HDAC-2 adopts an open-faced α/β structure consisting of 8 central parallel β-sheets
sandwiched between 12 α-helices. The ligand binding cleft lies almost in the plane of the
central β-sheet, and is formed primarily by loops emanating from the carboxy-terminal ends of
the β-strands comprising the sheet. Residues which form loop regions extending between
β-strand 1 and α-helix 1 and between α-helix 4 and α-helix 5 provide key surface interactions
with bound ligands. Residues which form loop regions extending between β-strand 3 and
α-helix 6, between β-strand 4 and α-helix 7, and between β-strand 8 and α-helix 10 play
important roles in defining the shape of the ligand binding pocket, and are involved in a
number of key interactions with the bound ligands.

[0138] The only other protein that possesses a high degree of structural and topological
homology to HDAC-2 is histone deacetylase from A. aeolicus (HDLP, pdb entry 1C3P). The
structure of this protein is reported in PCT Publication No. WO 01/18045. Based on data from
this publication, the two structures can be superimposed with a Cα RMSD of 2.25 Å.
However, HDAC-2 only possesses 12 α-helices while HDLP contains 16. Also, the HDAC-2
structure differs significantly from HDLP in a number of key aspects, particularly in the
relative spatial dispositions of the active site loops and the structural landscape in the vicinity of
the active site. Figure 5 illustrates a representation of the active site of HDAC-2 and the
relative orientations of the bound TSA molecule based on the structure coordinates shown in
Figure 3. The catalytic machinery residing at the bottom of the active site pocket is identical to
that observed in HDLP. A 7 Å x 5 Å "foot pocket" extends into the protein interior from the
bottom of the catalytic pocket in a direction perpendicular to the active site pocket. The central
region of the active site pocket is lined with a hydrophobic band of residues, including F155,
F210 and L276. The active site pocket of HDAC-2 is narrower than its counterpart in HDLP,
and the residues lining the pocket are likely more conformationally restricted.

6. HDAC-2 Binding Pocket and Ligand Interaction 
[0139] The terms "binding site" or "binding pocket", as used herein, refer to a region of a 
protein that, as a result of its shape, favorably associates with a ligand or substrate. The term
"HDAC-2-like binding pocket" refers to a portion of a molecule or molecular complex whose
shape is sufficiently similar to the HDAC-2 binding pockets as to bind common ligands. This
commonality of shape may be quantitatively defined by a root mean square deviation (RMSD) 
from the structure coordinates of the backbone atoms of the amino acids that make up the 
binding pockets in HDAC-2 (as set forth in Figure 3). 

[0140] The "active site binding pocket" (Fig. 5) or "active site" of HDAC-2 refers to the
area on the surface of HDAC-2 where the substrate (a TSA inhibitor molecule) binds. Figure 5
illustrates TSA bound in the active site of HDAC-2 based on the crystal structure of the present
invention. TSA binds with its hydroxamate moiety ligating the zinc ion bound at the bottom of
the pocket. Key interactions between groups in the binding pockets and the TSA molecule are
depicted and described in Figure 5.

[0141] To date, the active site binding pocket of histone deacetylases (based on the structure 
of HDLP) has been the only target for the design of small molecule inhibitors. A number of 
key substrate binding and catalytic residues observed in the active site binding pocket of 
HDAC-2 appear well conserved among all class I histone deacetylases. However, when the 
overall sequence of the TSA binding pocket in the HDAC-2-Zn2+-TSA complex is compared
with the aligned sequences of other HDACs, significant sequence variability is observed, which 
is reflective of diversity among members of the HDAC family. The binding pocket likely 
shows subtle differences in shape and chemical content that may be explored to confer 
specificity of inhibition. 

[0142] In resolving the crystal structure of HDAC-2 in complex with Zn2+-TSA, applicants
determined that the HDAC-2 amino acids in Table 2 (above) are within 4 Angstroms of and
therefore close enough to interact with the bound TSA molecules. Applicants have also
determined that the amino acids of Table 3 (above) are within 7 Angstroms of bound TSA
molecules and therefore are also close enough to interact with that inhibitor or analogs thereof.
Further it has been determined that the amino acids of Table 4 (above) are within 10
Angstroms of bound TSA molecules and therefore are also close enough to interact with that
inhibitor or analogs thereof. The 4, 7, and/or 10 Angstrom sets of amino acids are preferably
conserved in variants of HDAC-2. While it is desirable to largely conserve these residues, it
should be recognized however that variants may also involve varying 1, 2, 3, 4 or more of the
residues set forth in Tables 2, 3, and 4 in order to evaluate the roles these amino acids play in
the binding pocket.

[0143] With the knowledge of the HDAC-2 crystal structure provided herein, Applicants
define the HDAC-2 binding pocket as binding pockets where the relative positioning of the 4,
7, and/or 10 Angstrom sets of amino acids are substantially conserved. Again, it is noted that
it may be desirable to form variants where 1, 2, 3, 4 or more of the residues set forth in Tables
2, 3, and 4 are varied in order to evaluate the roles these amino acids play in the binding
pockets. Accordingly, any set of structure coordinates for a protein from any source having a
root mean square deviation equal to or less than the RMSD value specified in Columns 3, 4 or
5 of Table 1 when compared to the structure coordinates of Figure 3, the root mean square
deviation being calculated such that the portion of amino acid residues specified in Column 2
of Table 1 of each set of structure coordinates are superimposed and the root mean square
deviation is based only on those amino acid residues in the structure coordinates that are also
present in the portion of the protein specified in Column 1 of Table 1.
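
The way the 4, 7 and 10 Angstrom residue sets discussed above relate to a bound ligand can be sketched as a simple distance filter. The function name, residue labels and coordinates are hypothetical and are not taken from Figure 3; the sketch assumes a residue belongs to a set when any of its atoms lies within the cutoff of any ligand atom.

```python
import math


def residues_within(protein_atoms, ligand_atoms, cutoff):
    """Return sorted residue labels having any atom within cutoff of the ligand.

    protein_atoms: list of (residue_label, (x, y, z)) tuples
    ligand_atoms:  list of (x, y, z) tuples
    """
    close = set()
    for label, xyz in protein_atoms:
        for lig in ligand_atoms:
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(xyz, lig)))
            if dist <= cutoff:
                close.add(label)
                break
    return sorted(close)


# Hypothetical coordinates in angstroms (illustration only).
ligand = [(0.0, 0.0, 0.0)]
protein = [
    ("RES_A", (3.0, 0.0, 0.0)),
    ("RES_B", (0.0, 6.0, 0.0)),
    ("RES_C", (0.0, 0.0, 9.0)),
    ("RES_D", (20.0, 0.0, 0.0)),
]
for cutoff in (4.0, 7.0, 10.0):
    print(cutoff, residues_within(protein, ligand, cutoff))
```

Each larger cutoff yields a superset of the previous set, which matches the nesting of the 4, 7 and 10 Angstrom sets.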

[0144] As noted previously, the root mean square deviation is intended to be limited to only 
those alpha-carbon atoms or non-hydrogen atoms of amino acid residues that are common to 
both the protein fragments represented in Figure 3, and the protein whose structure coordinates 
are being compared to the coordinates shown in Figure 3, since the sequence of the protein may 
be varied somewhat. 

[0145] Accordingly, in various embodiments, the invention relates to data, computer 
readable media comprising data, and uses of the data where the data comprises structure 
coordinates that have a root mean square deviation equal to or less than the RMSD value 
specified in Columns 3, 4 or 5 of Table 1 when compared to the structure coordinates of Figure
3, the root mean square deviation being calculated such that the portion of amino acid residues
specified in Column 2 of Table 1 of each set of structure coordinates are superimposed and the 
root mean square deviation is based only on those amino acid residues in the structure 
coordinates that are also present in the portion of the protein specified in Column 1 of Table 1. 

[0146] As noted above, there are many different ways to express the surface contours of
the HDAC-2 structure other than by using the structure coordinates provided in Figure 3.
Accordingly, it is noted that the present invention is also directed to any data, computer 
readable media comprising data, and uses of the data where the data defines a computer model 
for a protein binding pocket, at least a portion of the computer model having a surface contour 
that has a root mean square deviation equal to or less than a given RMSD value specified in
Columns 3, 4 or 5 of Table 1 when the coordinates used to compute the surface contour are 
compared to the structure coordinates of Figure 3, wherein (a) the root mean square deviation is 
calculated by the calculation method set forth herein, (b) the portion of amino acid residues 
associated with the given RMSD value in Table 1 (specified in Column 2 of Table 1) are 
superimposed according to the RMSD calculation, and (c) the root mean square deviation is 
calculated based only on those amino acid residues present in both the protein being modeled 
and the portion of the protein associated with the given RMSD in Table 1 (specified in Column 
1 of Table 1). 

[0147] It is again noted that the root mean square deviation calculation may optionally be 
based on a comparison of non-hydrogen atoms. Also, the root mean square deviation of
alpha-carbon atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å,
1.5 Å, 1 Å, 0.5 Å, or less.
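
As a small worked example, a computed RMSD can be checked against the listed thresholds. The function name is hypothetical; the two RMSD values used are the alpha-carbon values reported in Table 7 for SEQ. ID No. 3 (2.25 Å) and for the 4 Angstrom set (0.67 Å) when compared with PDB:1C3R.

```python
# Thresholds in angstroms, as listed in the paragraph above.
CUTOFFS = (2.7, 2.5, 2.0, 1.5, 1.0, 0.5)


def satisfied_cutoffs(rmsd_value, cutoffs=CUTOFFS):
    """Return the thresholds that the given RMSD value falls below."""
    return [c for c in cutoffs if rmsd_value < c]


print(satisfied_cutoffs(2.25))
print(satisfied_cutoffs(0.67))
```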

[0148] It will be readily apparent to those of skill in the art that the numbering of amino 
acids in other isoforms of HDAC-2 may be different than that set forth for HDAC-2. 
Corresponding amino acids in other isoforms of HDAC-2 are easily identified by visual 
inspection of the amino acid sequences or by using commercially available homology software 
programs, as further described below. 

7. System For Displaying the Three Dimensional Structure of HDAC-2 
[0149] The present invention is also directed to machine-readable data storage media having
data storage material encoded with machine-readable data that comprises structure coordinates
for HDAC-2. The present invention is also directed to a machine readable data storage medium
having data storage material encoded with machine readable data, which, when read by an
appropriate machine, can display a three dimensional representation of a structure of HDAC-2.

[0150] All or a portion of the HDAC-2 coordinate data shown in Figure 3, when used in
conjunction with a computer programmed with software to translate those coordinates into the
three-dimensional structure of HDAC-2, may be used for a variety of purposes, especially for
purposes relating to drug discovery. Software for generating three-dimensional graphical
representations is known and commercially available. The ready use of the coordinate data
requires that it be stored in a computer-readable format. Thus, in accordance with the present
invention, data capable of being displayed as the three-dimensional structure of HDAC-2
and/or portions thereof and/or their structurally similar variants may be stored in a machine-
readable storage medium, which is capable of displaying a graphical three-dimensional
representation of the structure.

[0151] For example, in various embodiments, a computer is provided for producing a three- 
dimensional representation of at least an HDAC-2-like binding pocket, the computer 
comprising: 

machine readable data storage medium comprising a data storage material encoded with 
machine-readable data, the machine readable data comprising structure coordinates that have a 
root mean square deviation equal to or less than the RMSD value specified in Columns 3, 4 or 
5 of Table 1 when compared to the structure coordinates of Figure 3, the root mean square 
deviation being calculated such that the portion of amino acid residues specified in Column 2 
of Table 1 of each set of structure coordinates are superimposed and the root mean square 
deviation is based only on those amino acid residues in the structure coordinates that are also 
present in the portion of the protein specified in Column 1 of Table 1; 

a working memory for storing instructions for processing the machine-readable data; 

a central-processing unit coupled to the working memory and to the machine-readable 
data storage medium, for processing the machine-readable data into the three-dimensional 
representation; and 

an output hardware coupled to the central processing unit, for receiving the three 
dimensional representation. 

[0152] Another embodiment of this invention provides a machine-readable data storage 
medium, comprising a data storage material encoded with machine readable data which, when 
used by a machine programmed with instructions for using said data, displays a graphical three- 
dimensional representation comprising HDAC-2 or a portion or variant thereof. 

[0153] In various variations, the machine readable data comprises data for representing a
protein based on structure coordinates where the structure coordinates have a root mean square
deviation equal to or less than the RMSD value specified in Columns 3, 4 or 5 of Table 1 when
compared to the structure coordinates of Figure 3, the root mean square deviation being
calculated such that the portion of amino acid residues specified in Column 2 of Table 1 of
each set of structure coordinates are superimposed and the root mean square deviation is based
only on those amino acid residues in the structure coordinates that are also present in the
portion of the protein specified in Column 1 of Table 1.

[0154] According to another embodiment, the machine-readable data storage medium 
comprises a data storage material encoded with a first set of machine readable data which 
comprises the Fourier transform of structure coordinates that have a root mean square deviation 
equal to or less than the RMSD value specified in Columns 3, 4 or 5 of Table 1 when compared 
to the structure coordinates of Figure 3, the root mean square deviation being calculated such 
that the portion of amino acid residues specified in Column 2 of Table 1 of each set of structure 
coordinates are superimposed and the root mean square deviation is based only on those amino 
acid residues in the structure coordinates that are also present in the portion of the protein 
specified in Column 1 of Table 1, and which, when using a machine programmed with 
instructions for using said data, can be combined with a second set of machine readable data 
comprising the X-ray diffraction pattern of another molecule or molecular complex to 
determine at least a portion of the structure coordinates corresponding to the second set of 
machine readable data. For example, the Fourier transform of the structure coordinates set 
forth in Figure 3 may be used to determine at least a portion of the structure coordinates of 
other HDAC-2-like enzymes, and isoforms of HDAC-2. 
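
The Fourier transform of a set of structure coordinates yields calculated structure factors, which is what allows a known model to be compared with the diffraction pattern of another crystal. A minimal sketch of the structure factor sum follows. It makes simplifying assumptions that a real calculation would not: constant scattering factors, no temperature factors and no symmetry expansion. The function name and the two-atom cell are hypothetical.

```python
import cmath
import math


def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)).

    atoms: list of (scattering_factor, (x, y, z)) with fractional coordinates.
    """
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )


# Hypothetical cell with two identical atoms, at the origin and the body center.
atoms = [(6.0, (0.0, 0.0, 0.0)), (6.0, (0.5, 0.5, 0.5))]
for hkl in [(1, 0, 0), (1, 1, 0), (2, 0, 0)]:
    print(hkl, round(abs(structure_factor(hkl, atoms)), 6))
```

In this toy cell the two contributions cancel when h + k + l is odd and add when it is even, which shows how atomic positions are encoded in the amplitudes and phases of the transform.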

[0155] Optionally, a computer system is provided in combination with the machine-readable 
data storage medium provided herein. In one embodiment, the computer system comprises a 
working memory for storing instructions for processing the machine-readable data; a 
processing unit coupled to the working memory and to the machine-readable data storage 
medium, for processing the machine-readable data into the three-dimensional representation; 
and an output hardware coupled to the processing unit, for receiving the three-dimensional 
representation. 

[0156] Figure 6 illustrates an example of a computer system that may be used in 
combination with storage media according to the present invention. As illustrated, the 
computer system 10 includes a computer 11 comprising a central processing unit ("CPU") 20, a
working memory 22 which may be, e.g., RAM (random-access memory) or "core" memory, 
mass storage memory 24 (such as one or more disk drives or CD-ROM drives), one or more 
cathode-ray tube ("CRT") display terminals 26, one or more keyboards 28, one or more input
lines 30, and one or more output lines 40, all of which are interconnected by a conventional bi- 
directional system bus 50. 

[0157] Input hardware 36, coupled to computer 11 by input lines 30, may be implemented in
a variety of ways. For example, machine-readable data of this invention may be inputted via 
the use of a modem or modems 32 connected by a telephone line or dedicated data line 34. 
Alternatively or additionally, the input hardware 36 may comprise CD-ROM drives or disk 
drives 24. In conjunction with display terminal 26, keyboard 28 may also be used as an input 
device. 

[0158] Conventional devices, coupled to computer 11 by output lines 40, may similarly 
implement output hardware 46. By way of example, output hardware 46 may include CRT 
display terminal 26 for displaying a graphical representation of a binding pocket of this 
invention using a program such as MOE as described herein. Output hardware might also 
include a printer 42, so that hard copy output may be produced, or a disk drive 24, to store 
system output for later use. 

[0159] In operation, CPU 20 coordinates the use of the various input and output devices 36,
46; coordinates data accesses from mass storage 24 and accesses to and from working memory
22; and determines the sequence of data processing steps. A number of programs may be used
to process the machine-readable data of this invention. Such programs are discussed in
reference to using the three dimensional structure of HDAC-2 described herein.

[0160] The storage medium encoded with machine-readable data according to the present
invention can be any conventional data storage device known in the art. For example, the 
storage medium can be a conventional floppy diskette or hard disk. The storage medium can 
also be an optically readable data storage medium, such as a CD-ROM or a DVD-ROM, or a 
rewritable medium such as a magneto-optical disk that is optically readable and magneto- 
optically writable. 

8. Uses of the Three Dimensional Structure of HDAC-2 
[0161] The three-dimensional crystal structure of the present invention may be used to 
identify HDAC-2 binding sites, be used as a molecular replacement model to solve the 
structure of unknown crystallized proteins, to design mutants having desirable binding 
properties, and ultimately, to design, characterize, and identify entities capable of interacting 
with HDAC-2 and other structurally similar proteins as well as other uses that would be 
recognized by one of ordinary skill in the art. Such entities may be chemical entities or 
proteins. The term "chemical entity," as used herein, refers to chemical compounds, complexes 
of at least two chemical compounds, and fragments of such compounds. 
[0162] The HDAC-2 structure coordinates provided herein are useful for screening and 
identifying drugs that inhibit HDAC-2 and other structurally similar proteins. For example, the 
structure encoded by the data may be computationally evaluated for its ability to associate with 
putative substrates or ligands. Such compounds that associate with HDAC-2 may inhibit 
HDAC-2, and are potential drug candidates. Additionally or alternatively, the structure 
encoded by the data may be displayed in a graphical three-dimensional representation on a 
computer screen. This allows visual inspection of the structure, as well as visual inspection of 
the structure's association with the compounds. 

[0163] Thus, according to another embodiment of the present invention, a method is 
provided for evaluating the potential of an entity to associate with HDAC-2 or a fragment or 
variant thereof by using all or a portion of the structure coordinates provided in Figure 3 or 
functional equivalents thereof. A method is also provided for evaluating the potential of an 
entity to associate with HDAC-2 or a fragment or variant thereof by using structure coordinates 
similar to all or a portion of the structure coordinates provided in Figure 3 or functional 
equivalents thereof. 

[0164] The method may optionally comprise the steps of: creating a computer model of all 
or a portion of a protein structure (e.g., a binding pocket) using structure coordinates according 
to the present invention; performing a fitting operation between the entity and the computer 
model; and analyzing the results of the fitting operation to quantify the association between the 
entity and the model. The portion of the protein structure used optionally comprises all of the 
amino acids listed in Tables 2, 3, and 4 that are present in the structure coordinates being used. 
[0165] It is noted that the computer model may not necessarily directly use the structure 
coordinates. Rather, a computer model can be formed that defines a surface contour that is the 
same or similar to the surface contour defined by the structure coordinates. 
[0166] The structure coordinates provided herein can also be utilized in a method for 
identifying a ligand (e.g., entities capable of associating with a protein) of a protein comprising 
an HDAC-2-like binding pocket. One embodiment of the method comprises: using all or a 
portion of the structure coordinates provided herein to generate a three-dimensional structure of 
an HDAC-2-like binding pocket; employing the three-dimensional structure to design or select 
a potential ligand; synthesizing the potential ligand; and contacting the synthesized potential 
ligand with a protein comprising an HDAC-2-like binding pocket to determine the ability of the 
potential ligand to interact with the protein. According to this method, the structure coordinates 
used may have a root mean square deviation equal to or less than the RMSD values specified in 
Columns 3, 4 or 5 of Table 1 when compared to the structure coordinates of Figure 3 according 
to the RMSD calculation method set forth herein, provided that the portion of amino acid 
residues specified in Column 2 of Table 1 of each set of structure coordinates are superimposed 
and the root mean square deviation is calculated based only on those amino acid residues in the 
structure coordinates that are also present in the portion of the protein specified in Column 1 of 
Table 1. The portion of the protein structure used optionally comprises all of the amino acids 
listed in Tables 2, 3, and 4 that are present. 
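
The comparison of two sets of structure coordinates described above can be illustrated by a minimal Python sketch that superimposes matched atoms and reports their root mean square deviation. The sketch uses the Kabsch least-squares superposition with NumPy, which is an assumed choice and is not asserted to be the RMSD calculation method set forth herein. The function name `superposed_rmsd` is hypothetical, and the caller is assumed to have already selected the matching atoms (for example, those of the residues specified in Table 1) in the same order in both arrays.

```python
import numpy as np


def superposed_rmsd(reference, mobile):
    """Return the RMSD (input units) after optimal rigid-body superposition.

    Both arguments are (N, 3) arrays holding the same atoms in the same order.
    """
    ref = np.asarray(reference, dtype=float)
    mob = np.asarray(mobile, dtype=float)
    if ref.shape != mob.shape or ref.ndim != 2 or ref.shape[1] != 3:
        raise ValueError("expected two (N, 3) arrays with matching atoms")

    # Remove the translational component by centring both sets on their centroids.
    ref_c = ref - ref.mean(axis=0)
    mob_c = mob - mob.mean(axis=0)

    # Kabsch: the SVD of the covariance matrix yields the least-squares rotation.
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Guard against an improper rotation (a reflection).
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T

    fitted = mob_c @ rotation.T
    return float(np.sqrt(np.mean(np.sum((fitted - ref_c) ** 2, axis=1))))
```

The value returned would then be compared against the applicable RMSD threshold; reading the coordinates themselves from a coordinate file is outside the scope of this sketch.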

[0167] As noted previously, the three-dimensional structure of an HDAC-2-like binding 
pocket need not be generated directly from structure coordinates. Rather, a computer model 
can be formed that defines a surface contour that is the same or similar to the surface contour 
defined by the structure coordinates. 

[0168] A method is also provided for evaluating the ability of an entity, such as a compound 
or a protein to associate with an HDAC-2-like binding pocket, the method comprising: 
constructing a computer model of a binding pocket defined by structure coordinates that have a 
root mean square deviation equal to or less than the RMSD value specified in Columns 3, 4 or 
5 of Table 1 when compared to the structure coordinates of Figure 3, the root mean square 
deviation being calculated such that the portion of amino acid residues specified in Column 2 
of Table 1 of each set of structure coordinates are superimposed and the root mean square 
deviation is based only on those amino acid residues in the structure coordinates that are also 
present in the portion of the protein specified in Column 1 of Table 1; selecting an entity to be 
evaluated by a method selected from the group consisting of (i) assembling molecular 
fragments into the entity, (ii) selecting an entity from a small molecule database, (iii) de novo 
ligand design of the entity, and (iv) modifying a known ligand for HDAC-2, or a portion 
thereof; performing a fitting program operation between computer models of the entity to be 
evaluated and the binding pocket in order to provide an energy-minimized configuration of the 
entity in the binding pocket; and evaluating the results of the fitting operation to quantify the 
association between the entity and the binding pocket model in order to evaluate the ability of 
the entity to associate with the binding pocket. 

[0169] The computer model of a binding pocket used in this embodiment need not be 
generated directly from structure coordinates. Rather, a computer model can be formed that 
defines a surface contour that is the same or similar to the surface contour defined by the 
structure coordinates. 

[0170] Also according to the method, the method may further include synthesizing the 
entity; and contacting a protein having an HDAC-2-like binding pocket with the synthesized 
entity. 

[0171] With the structure provided herein, the present invention for the first time permits the 
use of molecular design techniques to identify, select or design potential inhibitors of HDAC-2, 
based on the structure of an HDAC-2-like binding pocket. Such a predictive model is valuable in 
light of the high costs associated with the preparation and testing of the many diverse 
compounds that may possibly bind to the HDAC-2 protein. 

[0172] According to this invention, a potential HDAC-2 inhibitor may now be evaluated for 
its ability to bind an HDAC-2-like binding pocket prior to its actual synthesis and testing. If a 
proposed entity is predicted to have insufficient interaction or association with the binding 
pocket, preparation and testing of the entity can be obviated. However, if the computer 
modeling indicates a strong interaction, the entity may then be obtained and tested for its ability 
to bind. 

[0173] A potential inhibitor of an HDAC-2-like binding pocket may be computationally 
evaluated using a series of steps in which chemical entities or fragments are screened and 
selected for their ability to associate with the HDAC-2-like binding pockets. 
[0174] One skilled in the art may use one of several methods to screen entities (whether 
chemical or protein) for their ability to associate with an HDAC-2-like binding pocket. This 
process may begin by visual inspection of, for example, an HDAC-2-like binding pocket on a 
computer screen based on the HDAC-2 structure coordinates in Figure 3 or other coordinates 
which define a similar shape generated from the machine-readable storage medium. Selected 
fragments or chemical entities may then be positioned in a variety of orientations, or docked, 
within that binding pocket as defined above. Docking may be accomplished using software 
such as Quanta and Sybyl, followed by energy minimization and molecular dynamics with 
standard molecular mechanics force fields, such as CHARMM and AMBER. 
[0175] Specialized computer programs may also assist in the process of selecting entities. 
These include: GRID (P. J. Goodford, "A Computational Procedure for Determining 
Energetically Favorable Binding Sites on Biologically Important Macromolecules", J. Med. 
Chem., 28, pp. 849-857 (1985)). GRID is available from Oxford University, Oxford, UK; 
MCSS (A. Miranker et al., "Functionality Maps of Binding Sites: A Multiple Copy 
Simultaneous Search Method." Proteins: Structure, Function and Genetics, 11, pp. 29-34 
(1991)). MCSS is available from Molecular Simulations, San Diego, Calif.; AUTODOCK (D. 
S. Goodsell et al., "Automated Docking of Substrates to Proteins by Simulated Annealing", 
Proteins: Structure, Function, and Genetics, 8, pp. 195-202 (1990)). AUTODOCK is available 
from Scripps Research Institute, La Jolla, Calif.; & DOCK (I. D. Kuntz et al., "A Geometric 
Approach to Macromolecule-Ligand Interactions", J. Mol. Biol., 161, pp. 269-288 (1982)). 
DOCK is available from University of California, San Francisco, Calif. 

[0176] Once suitable entities have been selected, they can be designed or assembled. 
Assembly may be preceded by visual inspection of the relationship of the fragments to each 
other on the three-dimensional image displayed on a computer screen in relation to the structure 
coordinates of HDAC-2. This may then be followed by manual model building using software 
such as MOE, QUANTA or Sybyl [Tripos Associates, St. Louis, Mo]. 

[0177] Useful programs to aid one of skill in the art in connecting the individual chemical 
entities or fragments include: CAVEAT (P. A. Bartlett et al, "CAVEAT: A Program to 
Facilitate the Structure-Derived Design of Biologically Active Molecules", in "Molecular 
Recognition in Chemical and Biological Problems", Special Pub., Royal Chem. Soc, 78, pp. 
182-196 (1989); G. Lauri and P. A. Bartlett, "CAVEAT: a Program to Facilitate the Design of 
Organic Molecules", J. Comput. Aided Mol. Des., 8, pp. 51-66 (1994)). CAVEAT is available 
from the University of California, Berkeley, Calif.; 3D Database systems such as ISIS (MDL 
Information Systems, San Leandro, Calif.). This area is reviewed in Y. C. Martin, "3D 
Database Searching in Drug Design", J. Med. Chem., 35, pp. 2145-2154 (1992); HOOK (M. B. 
Eisen et al, "HOOK: A Program for Finding Novel Molecular Architectures that Satisfy the 
Chemical and Steric Requirements of a Macromolecule Binding Site", Proteins: Struct., Funct., 
Genet., 19, pp. 199-221 (1994)). HOOK is available from Molecular Simulations, San Diego, 
Calif. 

[0178] Instead of proceeding to build an inhibitor of an HDAC-2-like binding pocket in a 
step-wise fashion one fragment or entity at a time as described above, inhibitory or other 
HDAC-2 binding compounds may be designed as a whole or "de novo" using either an empty 
binding site or optionally including some portion(s) of a known inhibitor(s). There are many de 
novo ligand design methods including: LUDI (H.-J. Bohm, "The Computer Program LUDI: A 
New Method for the De Novo Design of Enzyme Inhibitors", J. Comp. Aid. Molec. Design, 6, 
pp. 61-78 (1992)). LUDI is available from Molecular Simulations Incorporated, San Diego, 
Calif.; LEGEND (Y. Nishibata et al., Tetrahedron, 47, p. 8985 (1991)). LEGEND is available 
from Molecular Simulations Incorporated, San Diego, Calif.; LEAPFROG (available from 
Tripos Associates, St. Louis, Mo.); & SPROUT (V. Gillet et al, "SPROUT: A Program for 
Structure Generation", J. Comput. Aided Mol. Design, 7, pp. 127-153 (1993)). SPROUT is 
available from the University of Leeds, UK. 

[0179] Other molecular modeling techniques may also be employed in accordance with this 
invention (see, e.g., Cohen et al., "Molecular Modeling Software and Methods for Medicinal 
Chemistry", J. Med. Chem., 33, pp. 883-894 (1990); see also, M. A. Navia and M. A. Murcko, 
"The Use of Structural Information in Drug Design", Current Opinions in Structural Biology, 2, 
pp. 202-210 (1992); L. M. Balbes et al., "A Perspective of Modern Methods in Computer- 
Aided Drug Design", in Reviews in Computational Chemistry, Vol. 5, K. B. Lipkowitz and D. 
B. Boyd, Eds., VCH, New York, pp. 337-380 (1994); see also, W. C. Guida, "Software For 
Structure-Based Drug Design", Curr. Opin. Struct. Biology, 4, pp. 777-781 (1994)). 
[0180] Once an entity has been designed or selected, for example, by the above methods, the 
efficiency with which that entity may bind to an HDAC-2 binding pocket may be tested and 
optimized by computational evaluation. For example, an effective HDAC-2 binding pocket 
inhibitor preferably demonstrates a relatively small difference in energy between its bound and 
free states (i.e., a small deformation energy of binding). Thus, the most efficient HDAC-2 
binding pocket inhibitors should preferably be designed with deformation energy of binding of 
not greater than about 10 kcal/mole, and more preferably, not greater than 7 kcal/mole. 
HDAC-2 binding pocket inhibitors may interact with the binding pocket in more than one of 
multiple conformations that are similar in overall binding energy. In those cases, the 
deformation energy of binding is taken to be the difference between the energy of the free 
entity and the average energy of the conformations observed when the inhibitor binds to the 
protein. 
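
By way of illustration only, the deformation energy criterion described above can be sketched in Python as follows. The function names, the sign convention (mean bound-state energy minus free-state energy) and the example energies are assumptions made for this sketch; the 10 kcal/mole and 7 kcal/mole limits are those stated above. The energies themselves would be supplied by an external program of the kind discussed herein.

```python
def deformation_energy(free_energy, bound_energies):
    """Mean energy of the bound conformations minus the free-state energy (kcal/mole).

    When only one bound conformation is observed, pass a one-element list.
    """
    if not bound_energies:
        raise ValueError("at least one bound-conformation energy is required")
    mean_bound = sum(bound_energies) / len(bound_energies)
    return mean_bound - free_energy


def classify(deformation, limit=10.0, preferred=7.0):
    """Grade a deformation energy against the limits stated in the text."""
    if deformation <= preferred:
        return "preferred"
    if deformation <= limit:
        return "acceptable"
    return "rejected"


# Hypothetical energies for one candidate observed in two bound conformations.
candidate = deformation_energy(free_energy=-50.0, bound_energies=[-45.0, -43.0])
grade = classify(candidate)
```

Averaging over the bound conformations follows the definition given for entities that bind in more than one conformation of similar overall binding energy.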

[0181] An entity designed or selected as binding to an HDAC-2 binding pocket may be 
further computationally optimized so that in its bound state it would preferably lack repulsive 
electrostatic interaction with the target enzyme and with the surrounding water molecules. 
Such non-complementary electrostatic interactions include repulsive charge-charge, dipole- 
dipole and charge-dipole interactions. 
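
A minimal Python sketch of how repulsive charge-charge contacts might be flagged is given below. It assumes point charges, a constant dielectric and a simple distance cutoff, all of which are simplifications chosen for illustration; dipole-dipole and charge-dipole terms are not covered. The function names, atom labels, charges and coordinates are hypothetical.

```python
import math

# Conversion factor giving kcal/mole for charges in units of e and distances in Angstroms.
COULOMB_CONSTANT = 332.0637


def coulomb_energy(q1, q2, distance, dielectric=1.0):
    """Point-charge Coulomb energy in kcal/mole."""
    if distance <= 0.0:
        raise ValueError("distance must be positive")
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)


def repulsive_contacts(entity_atoms, pocket_atoms, cutoff=6.0, dielectric=1.0):
    """List (entity_label, pocket_label, distance, energy) for pairs with energy > 0.

    Each atom is a tuple (label, charge, (x, y, z)). Results are sorted with the
    most repulsive pair first.
    """
    contacts = []
    for e_label, e_charge, e_xyz in entity_atoms:
        for p_label, p_charge, p_xyz in pocket_atoms:
            distance = math.dist(e_xyz, p_xyz)
            if distance == 0.0 or distance > cutoff:
                continue
            energy = coulomb_energy(e_charge, p_charge, distance, dielectric)
            if energy > 0.0:
                contacts.append((e_label, p_label, distance, energy))
    contacts.sort(key=lambda contact: contact[3], reverse=True)
    return contacts
```

An entity for which this list is empty, or contains only weak terms, would be a candidate for retention under the criterion described above.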

[0182] Specific computer software is available in the art to evaluate compound deformation 
energy and electrostatic interactions. Examples of programs designed for such uses include: 
Gaussian 94, revision C (M. J. Frisch, Gaussian, Inc., Pittsburgh, Pa. ©1995); 
AMBER, version 4.1 (P. A. Kollman, University of California at San Francisco, ©1995); 
QUANTA/CHARMM (Molecular Simulations, Inc., San Diego, Calif. ©1995); 
Insight II/Discover (Molecular Simulations, Inc., San Diego, Calif. ©1995); 
DelPhi (Molecular Simulations, Inc., San Diego, Calif. ©1995); 
and AMSOL (Quantum Chemistry Program Exchange, Indiana University). These programs 
may be implemented, for instance, using a Silicon Graphics workstation such as an 
Indigo² with "IMPACT" graphics. Other hardware systems and software packages will be 
known to those skilled in the art. 

[0183] Another approach provided by this invention, is the computational screening of small 
molecule databases for chemical entities or compounds that can bind in whole, or in part, to an 
HDAC-2 binding pocket. In this screening, the quality of fit of such entities to the binding site 
may be judged either by shape complementarities or by estimated interaction energy [E. C. 
Meng et al., J. Comp. Chem., 13, 505-524 (1992)]. 

[0184] According to another embodiment, the invention provides compounds that associate 
with an HDAC-2-like binding pocket produced or identified by various methods set forth 
above. 

[0185] The structure coordinates set forth in Figure 3 can also be used to aid in obtaining 
structural information about another crystallized molecule or molecular complex. This may be 
achieved by any of a number of well-known techniques, including molecular replacement. 
[0186] For example, a method is also provided for utilizing molecular replacement to obtain 
structural information about a protein whose structure is unknown comprising the steps of: 
generating an X-ray diffraction pattern of a crystal of the protein whose structure is unknown; 
generating a three-dimensional electron density map of the protein whose structure is unknown 
from the X-ray diffraction pattern by using at least a portion of the structure coordinates set 
forth in Figure 3 as a molecular replacement model. 

[0187] By using molecular replacement, all or part of the structure coordinates of the 
HDAC-2 provided by this invention (and set forth in Figure 3) can be used to determine the 
structure of another crystallized molecule or molecular complex more quickly and efficiently 
than attempting an ab initio structure determination. One particular use includes use with other 
structurally similar proteins. Molecular replacement provides an accurate estimation of the 
phases for an unknown structure. Phases are a factor in equations used to solve crystal 
structures that cannot be determined directly. Obtaining accurate values for the phases, by 
methods other than molecular replacement, is a time-consuming process that involves iterative 
cycles of approximations and refinements and greatly hinders the solution of crystal structures. 
However, when the crystal structure of a protein containing at least a homologous portion has 
been solved, the phases from the known structure provide a satisfactory estimate of the phases 
for the unknown structure. 

[0188] Thus, this method involves generating a preliminary model of a molecule or 
molecular complex whose structure coordinates are unknown, by orienting and positioning the 
relevant portion of HDAC-2 according to Figure 3 within the unit cell of the crystal of the 
unknown molecule or molecular complex so as best to account for the observed X-ray 
diffraction pattern of the crystal of the molecule or molecular complex whose structure is 
unknown. Phases can then be calculated from this model and combined with the observed X- 
ray diffraction pattern amplitudes to generate an electron density map of the structure whose 
coordinates are unknown. This, in turn, can be subjected to any well-known model building 
and structure refinement techniques to provide a final, accurate structure of the unknown 
crystallized molecule or molecular complex [E. Lattman, "Use of the Rotation and Translation 
Functions", in Meth. Enzymol., 115, pp. 55-77 (1985); M. G. Rossmann, ed., "The Molecular 
Replacement Method", Int. Sci. Rev. Ser., No. 13, Gordon & Breach, New York (1972)]. 
[0189] The structure of any portion of any crystallized molecule or molecular complex that 
is sufficiently homologous to any portion of HDAC-2 can be resolved by this method. 
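
The step described above, in which phases calculated from a positioned model are combined with observed amplitudes to give an electron density map, can be illustrated with a one-dimensional toy example in Python. This is a sketch under simplifying assumptions: the "crystal" is a periodic line of Gaussian atoms, NumPy's discrete Fourier transform stands in for the structure factor calculation (its sign convention is used consistently but is not the crystallographic one), and all positions and weights are invented for illustration.

```python
import numpy as np


def gaussian_density(n_grid, atoms, width=0.02):
    """Periodic 1-D toy density from (fractional_position, weight) pairs."""
    x = np.arange(n_grid) / n_grid
    rho = np.zeros(n_grid)
    for position, weight in atoms:
        d = x - position
        d -= np.round(d)  # wrap distances into [-0.5, 0.5] for periodicity
        rho += weight * np.exp(-0.5 * (d / width) ** 2)
    return rho


def map_from_amplitudes_and_phases(amplitudes, phases):
    """Fourier synthesis: combine amplitudes with phases and transform back."""
    return np.fft.ifft(amplitudes * np.exp(1j * phases)).real


# "Unknown" structure: three atoms. Search model: the homologous two-atom portion.
target = gaussian_density(128, [(0.20, 6.0), (0.45, 8.0), (0.70, 7.0)])
model = gaussian_density(128, [(0.20, 6.0), (0.45, 8.0)])

f_obs = np.abs(np.fft.fft(target))       # amplitudes, as measured by diffraction
phi_calc = np.angle(np.fft.fft(model))   # phases, as supplied by the placed model
rho_mr = map_from_amplitudes_and_phases(f_obs, phi_calc)
```

In this sketch `rho_mr` plays the role of the preliminary map that would then be subjected to model building and refinement; supplying the true phases instead of `phi_calc` returns the target density exactly, which shows that the phases are the only information missing from the measured amplitudes.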
[0190] In one embodiment, the method of molecular replacement is utilized to obtain 
structural information about the present invention and any other HDAC-2-like molecule. The 
structure coordinates of HDAC-2, as provided by this invention, are particularly useful in 
solving the structure of other isoforms of HDAC-2 or HDAC-2 complexes. 
[0191] The structure coordinates of HDAC-2 as provided by this invention are useful in 
solving the structure of HDAC-2 variants that have amino acid substitutions, additions and/or 
deletions (referred to collectively as "HDAC-2 mutants", as compared to naturally occurring 
HDAC-2). These HDAC-2 mutants may optionally be crystallized in co-complex with a 
ligand, such as an inhibitor, substrate analogue or a suicide substrate. The crystal structures of 
a series of such complexes may then be solved by molecular replacement and compared with 
that of HDAC-2. Potential sites for modification within the various binding sites of the enzyme 
may thus be identified. This information provides an additional tool for determining the most 
efficient binding interactions such as, for example, increased hydrophobic interactions, between 
HDAC-2 and a ligand. It is noted that the ligand may be the protein's natural ligand or may be 
a potential agonist or antagonist of a protein. 

[0192] All of the complexes referred to above may be studied using well-known X-ray 
diffraction techniques and may be refined versus 1.5-3 Å resolution X-ray data to an R value of 
about 0.22 or less using computer software, such as X-PLOR [Yale University, 
©1992, distributed by Molecular Simulations, Inc.; see, e.g., Blundell & Johnson, 
supra; Meth. Enzymol., Vol. 114 & 115, H. W. Wyckoff et al., eds., Academic Press (1985)]. 
This information may thus be used to optimize known HDAC-2 inhibitors, and more 
importantly, to design new HDAC-2 inhibitors. 

[0193] The structure coordinates described above may also be used to derive the dihedral 
angles, phi and psi, that define the conformation of the amino acids in the protein backbone. 
As will be understood by those skilled in the art, the phi_n angle refers to the rotation around the 
bond between the alpha-carbon and the nitrogen, and the psi_n angle refers to the rotation around 
the bond between the carbonyl carbon and the alpha-carbon. The subscript "n" identifies the 
amino acid whose conformation is being described [for a general reference, see Blundell and 
Johnson, Protein Crystallography, Academic Press, London, 1976]. 
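
The derivation of these dihedral angles from atomic coordinates can be sketched in Python as follows. The sketch assumes the usual backbone definitions, in which phi_n is the torsion C(n-1)-N(n)-CA(n)-C(n) and psi_n is the torsion N(n)-CA(n)-C(n)-N(n+1), and returns signed angles in degrees. The function names are hypothetical, and each atom is given as an (x, y, z) tuple.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def backbone_phi(prev_c, n, ca, c):
    """phi_n: rotation about the N-CA bond of residue n."""
    return dihedral(prev_c, n, ca, c)


def backbone_psi(n, ca, c, next_n):
    """psi_n: rotation about the CA-C bond of residue n."""
    return dihedral(n, ca, c, next_n)
```

Applied residue by residue along the chain, the two functions give the phi and psi values that define the backbone conformation; the first residue has no phi and the last has no psi because the neighboring atom is absent.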

9. Uses of the Crystal and Diffraction Pattern of HDAC-2 
[0194] Crystals, crystallization conditions and the diffraction pattern of HDAC-2 that can be 
generated from the crystals also have a range of uses. One particular use relates to screening 
entities that are not known ligands of HDAC-2 for their ability to bind to HDAC-2. For 
example, with the availability of crystallization conditions, crystals and diffraction patterns of 
HDAC-2 provided according to the present invention, it is possible to take a crystal of HDAC- 
2; expose the crystal to one or more entities that may be a ligand of HDAC-2; and determine 
whether a ligand/HDAC-2 complex is formed. The crystals of HDAC-2 may be exposed to 
potential ligands by various methods, including but not limited to, soaking a crystal in a 
solution of one or more potential ligands or co-crystallizing HDAC-2 in the presence of one or 
more potential ligands. Given the structure coordinates provided herein, once a ligand complex 
is formed, the structure coordinates can be used as a model in molecular replacement in order 
to determine the structure of the ligand complex. 

[0195] Once one or more ligands are identified, structural information from the ligand/ 
HDAC-2 complex(es) may be used to design new ligands that bind tighter, bind more 
specifically, have better biological activity or have better safety profiles than known ligands. 
[0196] In one embodiment, a method is provided for identifying a ligand that binds to 
HDAC-2 comprising: (a) attempting to crystallize a protein that comprises a sequence wherein 
at least a portion of the sequence has 55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% or greater 
identity with SEQ. ID No. 4 in the presence of one or more entities; (b) if crystals of the protein 
are obtained in step (a), obtaining an X-ray diffraction pattern of the protein crystal; and (c) 
determining whether a ligand/protein complex was formed by comparing an X-ray diffraction 
pattern of a crystal of the protein formed in the absence of the one or more entities to the crystal 
formed in the presence of the one or more entities. 
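
The percent identity thresholds recited above can be illustrated with a short Python sketch. It assumes that the two sequences have already been aligned by an external program and that identity is counted over aligned, non-gap columns; other denominators are in use, so this definition is an assumption rather than the one required herein. The function names and the example sequences are hypothetical placeholders and are not taken from SEQ. ID No. 4 or 5.

```python
IDENTITY_LEVELS = (55, 65, 75, 85, 90, 95, 97, 99)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def highest_level_met(identity):
    """Largest recited threshold that the identity value reaches, or None."""
    met = [level for level in IDENTITY_LEVELS if identity >= level]
    return max(met) if met else None


# Placeholder alignment: one substitution and one gap over ten columns.
identity = percent_identity("ACDEFGHIKL", "ACDQFGH-KL")
level = highest_level_met(identity)
```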

[0197] In one embodiment, a method is provided for identifying a ligand that binds to 
HDAC-2 comprising: (a) attempting to crystallize a protein that comprises a sequence wherein 
at least a portion of the sequence has 55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% or greater 
identity with SEQ. ID No. 5 in the presence of one or more entities; (b) if crystals of the protein 
are obtained in step (a), obtaining an X-ray diffraction pattern of the protein crystal; and (c) 
determining whether a ligand/protein complex was formed by comparing an X-ray diffraction 
pattern of a crystal of the protein formed in the absence of the one or more entities to the crystal 
formed in the presence of the one or more entities. 

[0198] In another embodiment, a method is provided for identifying a ligand that binds to 
HDAC-2 comprising: soaking a crystal of a protein wherein at least a portion of the protein has 
55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. ID No. 4 with one 
or more entities; determining whether a ligand/protein complex was formed by comparing an 
X-ray diffraction pattern of a crystal of the protein that has not been soaked with the one or 
more entities to the crystal that has been soaked with the one or more entities. 
[0199] In another embodiment, a method is provided for identifying a ligand that binds to 
HDAC-2 comprising: soaking a crystal of a protein wherein at least a portion of the protein has 
55%, 65%, 75%, 85%, 90%, 95%, 97%, 99% or greater identity with SEQ. ID No. 5 with one 
or more entities; determining whether a ligand/protein complex was formed by comparing an 
X-ray diffraction pattern of a crystal of the protein that has not been soaked with the one or 
more entities to the crystal that has been soaked with the one or more entities. 
[0200] Optionally, the method may further comprise converting the diffraction patterns into 
electron density maps using phases of the protein crystal and comparing the electron density 
maps. 

[0201] Libraries of "shape-diverse" compounds may optionally be used to allow direct 
identification of the ligand-receptor complex even when the ligand is exposed as part of a 
mixture. According to this variation, the need for time-consuming de-convolution of a hit from 
the mixture is avoided. More specifically, the calculated electron density function reveals the 
binding event, identifies the bound compound and provides a detailed 3-D structure of the 
ligand-receptor complex. Once a hit is found, one may optionally also screen a number of 
analogs or derivatives of the hit for tighter binding or better biological activity by traditional 
screening methods. The hit and information about the structure of the target may also be used 
to develop analogs or derivatives with tighter binding or better biological activity. It is noted 
that the ligand-HDAC-2 complex may optionally be exposed to additional iterations of 
potential ligands so that two or more hits can be linked together to make a more potent ligand. 
Screening for potential ligands by co-crystallization and/or soaking is further described in U.S. 
Patent No. 6,297,021, which is incorporated herein by reference. 

EXAMPLES 

Example 1. Expression, Purification and Crystallization of HDAC-2 (P2₁ Crystal Form) 
[0202] This example describes the expression, purification and crystallization of HDAC-2 
that resulted in a P2₁ crystal form. 

[0203] The portion of the gene encoding residues 1-488 (from SEQ. ID No. 1) which 
corresponds to the entire sequence of human HDAC-2 was amplified by PCR and cloned into 
the BamHI/SmaI site of pFastbac (Invitrogen) with a 6-histidine tag at the C-terminus. This 
DNA sequence is presented in Figure 1 as SEQ. ID No. 2. 

[0204] Expression in this vector generated a fusion of HDAC-2 residues 1-488 with a C- 
terminal 6x-histidine tag, the amino acid sequence of which is shown in Figure 1 as SEQ. ID No. 3. 
Recombinant baculoviruses incorporating the HDAC-2 constructs were generated by 
transposition using the Bac-to-Bac system (Invitrogen). High-titer viral stocks were generated 
by infection of Spodoptera frugiperda Sf9 cells and the expression of recombinant protein was 
carried out by infection of Sf9 cells (Invitrogen) in 10L Wave Bioreactors (Wave Biotech). 
Recombinant proteins were isolated from cellular extracts by passage over ProBond 
(Invitrogen) resin. Purified HDAC-2 protein samples were incubated in the presence of a 
known HDAC-2 inhibitor such as SAHA or TSA before subjecting samples to limited 
proteolysis utilizing cross linked enzyme crystals CLEC™-BL (Altus). The protein samples 
were incubated with CLEC and shaken for 90 minutes at 25°C to yield fragments of SEQ. ID 
No. 4. Cleavage reactions were terminated by centrifugation at 900g for 5 min and 
supernatants were tested for cleavage by mass spectroscopy. Samples were further purified 
over a POROS-S column (Applied Biosystems) followed by size exclusion chromatography 
through passage over BioSep S3000 resin (Phenomenex). The HDAC-2 protein purity as 
determined on denaturing SDS-PAGE gel was 90-95%. HDAC-2 was concentrated to a final 
concentration of 11 mg/ml and stored at 4°C in a buffer containing 25mM TRIS-HCl pH 7.6, 
250mM NaCl and 0.125 mM TCEP. 

[0205] HDAC-2 protein samples were incubated with 1mM Benzamidine and 5 mM CaCl2 
before setting up crystallization trials. Crystals were obtained after an extensive and broad 
screen of conditions, followed by optimization. 

[0206] Diffraction quality crystals were grown in 100nL sitting droplets using the vapor 
diffusion method. 50nL comprising the HDAC-2-inhibitor complex (15 mg/ml) was mixed 
with 50nL from a reservoir solution (100μL) comprising 0.1M CHES pH=9.25, 17.5 mM 
Ammonium Sulfate and 18% PEG MME 2000. The resulting solution was incubated over a 
period of one week at 4°C. 
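
As a worked illustration of the mixing step described above (50 nL of complex at 15 mg/ml combined with 50 nL of reservoir solution), the following Python sketch computes the composition of the freshly mixed drop. It assumes ideal volume additivity, ignores components carried over from the protein storage buffer, and does not model the subsequent concentration of the drop by vapor diffusion. The function name and dictionary keys are hypothetical.

```python
def initial_drop_composition(protein_volume_nl, protein_conc_mg_ml,
                             reservoir_volume_nl, reservoir):
    """Concentrations in the drop immediately after mixing protein and reservoir."""
    total = protein_volume_nl + reservoir_volume_nl
    if total <= 0:
        raise ValueError("total drop volume must be positive")
    protein_fraction = protein_volume_nl / total
    reservoir_fraction = reservoir_volume_nl / total
    # Each reservoir component is diluted by the reservoir's share of the drop.
    drop = {name: conc * reservoir_fraction for name, conc in reservoir.items()}
    drop["protein (mg/ml)"] = protein_conc_mg_ml * protein_fraction
    return drop


RESERVOIR = {
    "CHES (M)": 0.1,
    "ammonium sulfate (mM)": 17.5,
    "PEG MME 2000 (%)": 18.0,
}

drop = initial_drop_composition(50, 15.0, 50, RESERVOIR)
```

With equal volumes every component starts at half its stock concentration, and the drop then equilibrates toward the reservoir composition during incubation.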

[0207] Crystals typically appeared after 24-48 hours and grew to a maximum size within 72 
hours. Single crystals were transferred, briefly, into a cryoprotecting solution containing the 
reservoir solution supplemented with 30% v/v ethylene glycol. Crystals were then flash frozen 
by immersion in liquid nitrogen and then stored under liquid nitrogen. 

[0208] A crystal of HDAC-2 produced as described is illustrated in Figure 2A. The protein 
crystal was found to have a crystal lattice in a P2₁ space group. The protein crystal may also 
have a crystal lattice having unit cell dimensions, +/- 5%, of a=79.9Å, b=56.9Å, c=95.2Å, 
α=90°, β=90.5°, and γ=90°. Crystals of this crystal form were subjected to x-ray diffraction 
experiments in order to arrive at the structure coordinates reported herein. 
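
The unit cell parameters given above can be used, for example, to compute the cell volume and to test whether another crystal falls within the stated +/- 5% range. The Python sketch below uses the general triclinic volume formula, which reduces to a·b·c·sin(beta) for this monoclinic cell. Applying the 5% tolerance to the cell lengths only, and not to the angles, is an assumption made for this sketch, and the function names are hypothetical.

```python
import math

REFERENCE_CELL = {"a": 79.9, "b": 56.9, "c": 95.2,
                  "alpha": 90.0, "beta": 90.5, "gamma": 90.0}


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstroms (lengths in Angstroms, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(angle)) for angle in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def lengths_within_tolerance(cell, reference=REFERENCE_CELL, tolerance=0.05):
    """True if a, b and c each lie within the fractional tolerance of the reference."""
    return all(abs(cell[key] - reference[key]) <= tolerance * reference[key]
               for key in ("a", "b", "c"))


volume = cell_volume(**REFERENCE_CELL)
```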

Example 2. Expression, purification and crystallization of HDAC-2 (P2₁2₁2₁ Crystal Form) 

[0209] This example describes the expression, purification and crystallization of HDAC-2 
that resulted in a P2₁2₁2₁ crystal form. 

[0210] The portion of the gene encoding residues 1-488 (from SEQ. ID No. 1) which 
corresponds to the entire sequence of human HDAC-2 was amplified by PCR and cloned into 
the BamHI/SmaI site of pFastbac (Invitrogen) with a 6-histidine tag at the C-terminus. This 
DNA sequence is presented in Figure 1 as SEQ. ID No. 2. 

[0211] Expression in this vector generated a fusion of HDAC-2 residues 1-488 with a 
C-terminal 6x-histidine tag, the amino acid sequence of which is shown in Figure 1 as SEQ. ID No. 3. 
Recombinant baculoviruses incorporating the HDAC-2 constructs were generated by 
transposition using the Bac-to-Bac system (Invitrogen). High-titer viral stocks were generated 
by infection of Spodoptera frugiperda Sf9 cells and the expression of recombinant protein was 
carried out by infection of Sf9 cells (Invitrogen) in 10L Wave Bioreactors (Wave Biotech). 
Recombinant proteins were isolated from cellular extracts by passage over ProBond 
(Invitrogen) resin. Purified HDAC-2 protein samples were incubated in the presence of a 
known HDAC-2 inhibitor such as SAHA or TSA before subjecting samples to limited 
proteolysis utilizing Immobilized TPCK-Trypsin (Pierce). The protein samples were incubated 
with Immobilized TPCK-Trypsin and shaken for 90 minutes at 25°C to yield fragments of SEQ. 
ID No. 5. Cleavage reactions were terminated by centrifugation at 900g for 5 min and 
supernatants were tested for cleavage by mass spectrometry. Samples were further purified 
through size exclusion chromatography by passage over BioSep S3000 resin (Phenomenex). 
The HDAC-2 protein purity as determined on denaturing SDS-PAGE gel was 90-95%. 
HDAC-2 was concentrated to a final concentration of 11 mg/ml and stored at 4°C in a buffer 
containing 25mM TRIS-HCl pH 7.6, 250mM NaCl and 0.125 mM TCEP. 

[0212] HDAC-2 protein samples were incubated with 0.5mM Benzamidine and 5mM CaCl2 
before setting up crystallization trials. Crystals were obtained after an extensive and broad 
screen of conditions, followed by optimization. 

[0213] Diffraction quality crystals were grown in 100nL sitting droplets using the vapor 
diffusion method. 50nL comprising the resulting HDAC-2 complex (11.0 mg/ml) was mixed 
with 50nL from a reservoir solution (100µL) comprising 0.1M CHES pH=9.5 and 40% PEG 
600. The resulting solution was incubated over a period of one week at 20°C. Crystals 
typically appeared after 24-48 hours and grew to a maximum size within 72 hours. Single 
crystals were separated from their parent cluster (if necessary) and incubated in reservoir 
solution supplemented with 5 mM TSA for a period of 3 hours. Crystals were then flash frozen 
directly from reservoir solution by immersion in liquid nitrogen and then stored under liquid 
nitrogen. 

[0214] A crystal of HDAC-2 produced as described is illustrated in Figure 2B. The protein 
crystal was found to have a crystal lattice in a P2₁2₁2₁ space group. The protein crystal may 
also have a crystal lattice having unit cell dimensions, +/- 5%, of a=92.1Å, b=97.6Å, 
c=138.9Å, and α=β=γ=90°. 
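
One way to read the +/- 5% tolerance on the unit cell dimensions is as a per-axis window around each nominal edge length. The Python sketch below checks an observed cell against the nominal orthorhombic dimensions under that reading. It is a minimal illustration: the function name `within_tolerance` and the observed values are hypothetical, and applying the tolerance to edge lengths only (not angles) is an assumption.

```python
# Nominal edge lengths (angstroms) reported above for the orthorhombic form.
REFERENCE_CELL = {"a": 92.1, "b": 97.6, "c": 138.9}


def within_tolerance(observed, reference=REFERENCE_CELL, tolerance=0.05):
    """Return True if every observed edge length lies within the
    fractional `tolerance` of the corresponding reference length."""
    return all(
        abs(observed[axis] - ref) <= tolerance * ref
        for axis, ref in reference.items()
    )


# Hypothetical observed cells, for illustration only.
print(within_tolerance({"a": 93.0, "b": 96.8, "c": 140.2}))  # all axes inside the window
print(within_tolerance({"a": 97.0, "b": 97.6, "c": 138.9}))  # a deviates by more than 5%
```

Under this reading, the a axis may range from about 87.5Å to 96.7Å, so the second hypothetical cell falls outside the window.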

[0215] While the present invention is disclosed with reference to certain embodiments and 
examples detailed above, it is to be understood that these embodiments and examples are 
intended to be illustrative rather than limiting, as it is contemplated that modifications will 
readily occur to those skilled in the art, which modifications are intended to be within the scope 
of the invention and the appended claims. All patents, papers, and books cited in this 
application are incorporated herein in their entirety. 